Abstract. We consider the quantization of a class of non bandlimited signals, namely the class of discrete time
signals that can be recovered from their decimated version. Based on recent results, the signals of interest are
assumed to be the output of a single interpolation filter (single band model) or, more generally, the sum of the
outputs of $L$ interpolation filters (multiband model). By definition, these signals are oversampled and it is
reasonable to expect that we can reap the same benefits as well known efficient A/D techniques. In fact, by using
appropriate multirate models and reconstruction schemes, we first show that we can obtain a great reduction in the
quantization noise variance due to the oversampled nature of the signals. Alternatively, we also show that we can
achieve a substantial decrease in bit rate by appropriately decimating the signals and then quantizing them. To
further increase the effective quantizer resolution, noise shaping is introduced by optimizing pre- and postfilters
around the quantizer. We start with a scalar time invariant quantizer and study two important cases of LTI filters,
namely the case where the postfilter is the inverse of the prefilter and the more general case where the postfilter is
not related to the prefilter. Closed form expressions for the optimum filters and minimum mean squared error are
derived in each case for both the single band and multiband models. Due to the statistical nature of the signal of
interest, the class of noise shaping filters and quantizers is then enlarged to include linear periodically time varying
$(LPTV)_M$ filters and periodically time varying quantizers of period $M$. Because the general $(LPTV)_M$ case is
difficult to treat analytically, we study two special cases in great detail and give complete solutions for both the
single band and multiband models. Examples are also provided for performance comparisons between the LTI
case and the corresponding $(LPTV)_M$ one.
I. INTRODUCTION

It is well known that if a continuous time signal $x(t)$ is $\sigma$-bandlimited, then it can be recovered uniquely
from its samples $x(nT)$ as long as $T < \pi/\sigma$. Extensions of the lowpass sampling theorem such as the bandpass,
non uniform and derivative sampling theorems can be found in [1]. Recently, Walter [2] showed that, under some
conditions, a class of non bandlimited continuous time signals can be reconstructed from uniformly spaced samples
even though aliasing occurs. Vaidyanathan and Phoong [3], [4] developed the discrete time version of Walter's
result from a multirate digital filtering perspective. Specifically, they considered the class of non bandlimited signals
that can be modeled as the output of a single interpolation filter (single band model) as in Fig. 1 or as the output of
the more general multiband model of Fig. 2. The filter $F(e^{j\omega})$ in Fig. 1 and the filters $F_k(e^{j\omega})$,
$k = 0, 1, \ldots, L-1$, in Fig. 2 are usually a subset of $L$ synthesis filters in an $M$ channel maximally decimated
perfect reconstruction filter bank, although this is not a necessary condition. To give the reader a flavor of the major
ideas, consider for the moment the single band model of Fig. 1. The discrete time signal $x(n)$ is the output of an
interpolation filter $F(e^{j\omega})$. Even though this signal is not in general bandlimited, it is natural to expect that it
can be recovered from its decimated version $x(Mn)$. To see this, assume that $x(n)$ is modeled as in Fig. 1 and consider
$x(Mn)$, the $M$-fold decimated version of $x(n)$. If $F(e^{j\omega})$ is a Nyquist($M$) filter [5], then $x(Mn)$ is equal
to $y(n)$ and we have the relation $x(n) = \sum_k x(kM) f(n - kM)$. In other words, $x(n)$ is completely defined by the
samples $x(Mn)$ even though the filter $F(e^{j\omega})$ is not necessarily ideal. In [4], the authors consider the case
where $F(e^{j\omega})$ is not necessarily a Nyquist($M$) filter and show how similar reconstruction can be done. They also
consider the stability of the reconstruction process. It turns out that if one of the polyphase components of
$F(e^{j\omega})$ is free from unit circle zeros, then stability of reconstruction is guaranteed. Furthermore, even if all
the polyphase components of $F(e^{j\omega})$ have unit circle zeros, stable reconstruction can still be achieved by using
non uniform decimation. In this case, a sufficient condition for stable reconstruction is that $F(e^{j\omega})$ (assumed
FIR) has two polyphase components with no multiple zeros, i.e., each polyphase component has distinct zeros and they do
not share any common zero.
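
As a minimal numerical sketch of the Nyquist($M$) case described above (the triangular linear-interpolation filter and the function name `interpolate` are illustrative assumptions, not taken from the paper), the following Python code builds $x(n) = \sum_k y(k) f(n - kM)$ with $f(Mn) = \delta(n)$ and checks that $x(n)$ is fully determined by its decimated samples $x(Mn)$:

```python
import numpy as np

def interpolate(y, M):
    """Return x(n) = sum_k y(k) f(n - kM) for n = 0, ..., M*len(y) - 1.

    Assumed filter: the triangular taps f(n) = 1 - |n|/M for |n| < M,
    which satisfy f(0) = 1 and f(Mn) = 0 for n != 0 (Nyquist(M)).
    """
    f = 1.0 - np.abs(np.arange(-(M - 1), M)) / M   # taps for n = -(M-1), ..., M-1
    up = np.zeros(len(y) * M)
    up[::M] = y                                    # M-fold expander
    full = np.convolve(up, f)                      # full[i] corresponds to n = i - (M-1)
    return full[M - 1 : M - 1 + len(up)]           # keep n = 0, ..., M*len(y) - 1

y = np.array([1.0, -2.0, 0.5, 3.0])                # driving signal y(n)
x = interpolate(y, 3)                              # non bandlimited model output x(n)
x_rec = interpolate(x[::3], 3)                     # rebuild x(n) from x(Mn) alone
print(np.allclose(x[::3], y), np.allclose(x_rec, x))
```

The filter here is far from an ideal lowpass filter, yet the reconstruction from $x(Mn)$ is exact, which is the sense in which $x(n)$ is "oversampled".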

In this paper, we consider the efficient quantization of this class of non bandlimited signals that can be
modeled as in Fig. 1 or more generally as in Fig. 2. To motivate such a study, consider the schematic shown in
Fig. 3 where the box labeled Q is a simple uniform roundoff (PCM) quantizer. After going through the quantizer,
the signal $x(n)$ is now contaminated by an additive noise component $e(n)$. Assuming that the signal $x(n)$ is
bandlimited or equivalently oversampled (since a bandlimited signal can be further downsampled), we can low
pass filter the quantized signal $x(n) + e(n)$. The ideal low pass filter on the right removes the noise in the stopband
but does not change the signal component. In terms of signal and noise power, the signal power remains unchanged
whereas the noise power decreases proportionally to the oversampling ratio, usually expressed in the form $2^r$. It
can be shown that for every doubling of the oversampling ratio, i.e., for every unit increment in $r$, the signal to
noise ratio (SNR) improves by about 3 dB or equivalently, the quantizer resolution improves by one half bit (see
for example [6]). After low pass filtering, the quantized signal can be downsampled to the Nyquist rate without
affecting the signal to noise ratio. The idea is therefore to exploit the oversampled nature of the signal $x(n)$ to
trade off quantizer complexity for higher resolution. This technique is usually called oversampled PCM conversion.
Consider now the system of Fig. 4 where $P(e^{j\omega})$ is a linear time invariant (LTI) filter. The input signal $x(n)$ is
still assumed to be oversampled (bandlimited). In addition to the benefits described above, it can be shown that
this more sophisticated system produces a further decrease in the noise power by "cleverly" choosing the filter
$P(e^{j\omega})$ in Fig. 4. The filter pair $P(e^{j\omega})$ and $1/P(e^{j\omega})$ does not modify the input signal $x(n)$ in any way but only
affects the noise component $e(n)$. Similar to sigma-delta quantizers, the system of Fig. 4 introduces noise shaping
in the signal band to allow higher resolution quantization of bandlimited signals.

With these ideas in mind, observe now the output $x(n)$ of Fig. 1. Even though $x(n)$ is not bandlimited, it
can be reconstructed from its downsampled version as explained above. In this sense, it can be considered as an
oversampled signal. The question then arises: Can we obtain advantages similar to the above schemes for a non
bandlimited signal satisfying the model of Fig. 1 and more generally of Fig. 2? Furthermore, for a fixed set of
filters $F(e^{j\omega})$ (or $F_k(e^{j\omega})$, $k = 0, 1, \ldots, L-1$), what is the best prefilter $P(e^{j\omega})$ that minimizes the noise
power at the output? Do we gain more by using a more general postfilter $V(e^{j\omega})$ instead of $1/P(e^{j\omega})$? This is
a sample of the type of questions we answer in this paper. Indeed, we will show that, by replacing the ideal low
pass filter with the correct multirate reconstruction system, we can reap the same quantization advantages as in
the bandlimited case. As a simple example, consider the scheme of Fig. 5 where the finite order filter $F(e^{j\omega})$ is
such that its magnitude squared response $|F(e^{j\omega})|^2$ is Nyquist($M$), that is, $(|F(e^{j\omega})|^2)\downarrow_M = 1$ (we will motivate
such an assumption later in the paper). With this assumption, it can be shown that the signal $\hat{x}(n)$ in Fig. 5 is
equal to $x(n)$ in the absence of the quantizer and that the entire scheme of Fig. 5 behaves similarly to Fig. 3,
except that the low pass filtering is now multirate and non ideal. Thus, generally speaking, if a non bandlimited
signal can be reconstructed from its samples $x(Mn)$ because it satisfies a model like Fig. 1, then a low precision
quantizer should allow us to produce a high precision version $\hat{x}(n)$.

To bring the analogy closer to the scheme of Fig. 4, we should introduce noise shaping. This can be done by
using a pre- and postfilter before and after the quantizer respectively, as shown in Fig. 6. The prefilter $P(e^{j\omega})$ is
traditionally an integrating low pass filter. The postfilter $1/P(e^{j\omega})$ shapes the noise spectrum in order to further
decrease the noise variance. In this paper, we will derive closed form expressions for the optimal choice of $P(e^{j\omega})$
and the minimum average mean square error obtained from such a scheme. Several extensions to the above noise
shaping idea are then introduced. For example, we relax the requirement that the postfilter is the inverse of the
prefilter and assume a more general postfilter $V(e^{j\omega})$. Closed form expressions for the optimum filters in this case
and the minimum mean squared error are also derived. We would like to warn the reader at this point that no
optimization of finite order filters is performed in this paper. The emphasis is instead on finding an expression for
the theoretically optimum filters (without order constraint) to get an upper bound on the achievable gain with
practical inexpensive filters.

The quantization advantage offered by Fig. 5 and Fig. 6 can be useful, for example, in the following realistic
engineering scenario. Suppose $x(n)$ is generated at a point where we cannot afford very complex signal processing
(e.g., in deep space) and needs to be transmitted to a distant place (e.g., earth station). If we have the knowledge
that $x(n)$ admits a satisfactory model like Fig. 1, we can compress it using a very simple low pass filter $P(e^{j\omega})$
with one or two multipliers and then quantize the output before transmission. The postfilter $1/P(e^{j\omega})$ and the
expensive multirate filter are at the receiver end, where the complexity is acceptable.

Assume now that the main aim is to obtain a reduction in the bit rate (number of bits per second) rather than
accuracy (number of bits per sample). If we are allowed to perform discrete time filtering (of arbitrary complexity),
we will see that the best approach would be as in Fig. 7. In this setup, we first generate the driver signal $y(n)$
and then quantize it. The signal $\hat{x}(n)$, which is equal to $x(n)$ in the absence of quantization, is then generated. The
lower rate signal $y(n)$ in Fig. 7 can be regarded as the principal component signal in an orthonormal subband
coder. We will see throughout this paper that, by choosing this type of quantization system, we can obtain a large
reduction in the bit rate and/or improvement in the quantization accuracy depending on the particular signal model.

Summarizing, the main issue in this paper is how to take advantage of the signal model (Fig. 1 or Fig. 2) in
preparing a quantized or compressed version of $x(n)$. Our study is motivated by similar concepts that arise in
A/D conversion applications. We find that the choice of a particular scheme depends on how much processing we
are allowed to do before quantization. If processing is allowed, we first generate $y(n)$ by filtering and decimation
and then quantize it. Otherwise, we quantize $x(n)$ directly and then filter the quantized signal with the appropriate
multirate scheme. Noise shaping can also be introduced to obtain better resolution. In any case, an improvement
in accuracy and/or bit rate due to the signal model is always achieved.


1.1. Main results and outline of the paper

1.  In  section  II,  definitions  and  well  established  facts  of  various  multirate  and  statistical  signal  processing  concepts 
used  throughout  the  paper  are  reviewed. 

2.  In  section  III,  new  results  that  describe  the  statistical  behavior  of  signals  as  they  pass  through  multirate 
interconnections  are  presented.  These  results  will  then  be  used  to  derive  the  theorems  of  interest  of  the  paper. 

3.  In  section  V,  we  give  several  results  on  the  quantization  of  the  non  bandlimited  signal  x(n)  modeled  as  in 
Fig.  1.  The  signal  x(n)  is  first  quantized  to  an  average  of  b  bits  per  sample  and  then  filtered  by  the  multirate 
interconnection  in  Fig.  5.  We  show  that  the  multirate  system  does  not  affect  the  signal  component  but  reduces  the 
noise  variance  by  a  factor  of  M.  This  amounts  to  the  same  quantitative  advantage  obtained  from  the  oversampling 
PCM  technique  (0.5  bit  reduction  per  doubling  of  the  oversampling  ratio). 

4.  In  section  VI,  the  lower  rate  signal  y(n)  is  quantized  instead  of  x(n).  By  quantizing  y(n)  to  b  bits  per  sample, 
the  quantization  bit  rate  (number  of  bits  per  second)  is  decreased  by  a  factor  of  M  but  noise  reduction  due  to 
multirate  filtering  is  now  not  possible. 

5. In section VII, noise shaping is introduced in order to obtain better accuracy. First, we consider the use of pre-
and post linear time invariant filters $P(e^{j\omega})$ and $1/P(e^{j\omega})$ as in Fig. 6 together with a fixed time invariant quantizer
Q. For this case, the optimum filter $P_{opt}(e^{j\omega})$ that minimizes the quantization noise variance in the reconstructed
output $\hat{x}(n)$ is derived and a closed form expression for the average minimum mean square error is obtained. We
then consider the more general pre- and postfilters $P(e^{j\omega})$ and $V(e^{j\omega})$ as in Fig. 8. Closed form expressions for
the optimum filters and the average minimum mean square error are also found for this case.

6. In section VIII, we replace the linear time invariant filter $P(e^{j\omega})$ with a more general linear periodically time
varying filter of period $M$. This is motivated by the cyclo-widesense stationarity of $x(n)$. Since the problem of
finding the optimum general $(LPTV)_M$ filter (equivalently, biorthogonal filter bank) is difficult to treat
analytically, optimal solutions are given for two special cases of $(LPTV)_M$ filters. The first solution is for the set of $M$
filters $V_k(e^{j\omega})$ shown in Fig. 9. The filters $V_k(e^{j\omega})$ and $1/V_k(e^{j\omega})$ act as pre- and postfilters for the $k$th subband
quantizer. The second solution is for the case of an orthonormal filter bank or equivalently for a lossless $(LPTV)_M$
filter. The scheme is shown in Fig. 10 for the single band case.

7.  All  the  results  mentioned  above  are  also  generalized  for  the  multiband  case.  Furthermore,  examples  are  provided 
whenever  necessary  for  illustrative  purposes. 

II.  SUMMARY  OF  STANDARD  MULTIRATE  CONCEPTS 

1. Notations. Lower case letters are used for scalar time domain sequences. Upper case letters are used for
transform domain expressions. Bold faced quantities represent vectors and matrices. The superscripts $T$, $*$ and
$\dagger$ denote respectively the transpose, conjugate and the conjugate transpose operations for vectors and matrices.
The $M$-fold downsampler has an input-output relation $y(n) = x(n)\downarrow_M = x(Mn)$. The $M$-fold expander's input-output
relation is $y(n) = x(n)\uparrow_M = x(n/M)$ when $n$ is a multiple of $M$ and $y(n) = 0$ otherwise. The $M$-fold
polyphase representation of $X(e^{j\omega})$ is given by
$X(e^{j\omega}) = X_0(e^{jM\omega}) + e^{-j\omega} X_1(e^{jM\omega}) + \cdots + e^{-j(M-1)\omega} X_{M-1}(e^{jM\omega})$.
The polyphase components are given by $x_k(n) = x(Mn + k)$ or, in the frequency domain,
by $X_k(e^{j\omega}) = (e^{j\omega k} X(e^{j\omega}))\downarrow_M$. The tilde accent on a function $F(z)$ is defined such that $\tilde{F}(z)$ is the conjugate
transpose of $F(z)$, i.e., $\tilde{F}(z) = F^*(1/z^*)$.

2. Blocking a signal. Given a scalar signal $x(n)$, we define its $M$-fold blocked version $\mathbf{x}(n)$ by

$$\mathbf{x}(n) = (\, x(nM) \;\; x(nM-1) \;\; \cdots \;\; x(nM-M+1) \,)^T \tag{1}$$

Equivalently, the scalar sequence $x(n)$ is called the unblocked version of the vector process $\mathbf{x}(n)$. The blocking
and unblocking operations are shown in Fig. 11. The elements of the blocked version $\mathbf{x}(n)$ are the polyphase
components of $x(n)$.

3. Cyclo-widesense stationary process. A stochastic process $x(n)$ is said to be cyclo-widesense stationary
with period $M$, abbreviated as $(CWSS)_M$, if the $M$-fold blocked version $\mathbf{x}(n)$ is WSS. Alternatively [7], [8], a
process $x(n)$ is $(CWSS)_M$ if the mean and autocorrelation functions of $x(n)$ are periodic with period $M$, i.e.,

$$E[x(n)] = E[x(n + kM)] \;\; \forall\, n, k \quad \text{and} \quad R_{xx}(n, k) = R_{xx}(n + M, k) \;\; \forall\, n, k, \tag{2}$$

where $R_{xx}(n, k) = E[x(n) x^*(n - k)]$ is the autocorrelation function of $x(n)$.

4. Antialias($M$) filters. $F(e^{j\omega})$ is said to be an antialias($M$) filter if its output can be decimated $M$-fold without
aliasing, no matter what the input is. Equivalently, there is no overlap between the plots $F(e^{j(\omega - 2\pi k/M)})$ for
distinct $k$ in $0 \le k \le M-1$. Since this requires a stopband with infinite attenuation, these are ideal filters.
5. Orthonormal filter bank. An $M$-channel maximally decimated uniform filter bank (FB) is said to have the
perfect reconstruction (PR) property when $\mathbf{R}(e^{j\omega}) = \mathbf{E}^{-1}(e^{j\omega})$ where $\mathbf{E}(e^{j\omega})$ and $\mathbf{R}(e^{j\omega})$ denote respectively the
analysis and synthesis polyphase matrices [8]. In the case of an orthonormal filter bank, the analysis polyphase
matrix is paraunitary, i.e., $\mathbf{E}(e^{j\omega})\mathbf{E}^{\dagger}(e^{j\omega}) = \mathbf{I} \;\; \forall\, \omega$ and we choose $\mathbf{R}(e^{j\omega}) = \mathbf{E}^{\dagger}(e^{j\omega})$ for perfect reconstruction.
The analysis and synthesis filters are related by $F_k(e^{j\omega}) = H_k^*(e^{j\omega})$, that is, $f_k(n) = h_k^*(-n)$. It follows that, for
every $k$,

$$\int_{-\pi}^{\pi} |F_k(e^{j\omega})|^2 \frac{d\omega}{2\pi} = 1.$$

6. The coding gain of a system. Assume that we quantize $x(n)$ directly with $b$ bits as shown in Fig. 12. We
denote the corresponding mean square error (m.s.e.) by $\mathcal{E}_{direct}$. We then use the optimum pre- and postfilters (in
the mean square sense) around the quantizer. With the rate of the quantizer fixed to the same value $b$, we denote
the minimum m.s.e. in this case by $\mathcal{E}_{min}$. The ratio $\mathcal{E}_{direct}/\mathcal{E}_{min}$ is called the coding gain of the new system and,
as the name suggests, is a measure of the benefits provided by the pre/post filtering operation.
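
The basic operations defined in items 1 and 2 above can be sketched numerically as follows. This is a minimal Python/NumPy illustration rather than part of the original development: the function names are hypothetical, signals are finite arrays starting at $n = 0$, and samples with negative index are taken as zero in the blocking step.

```python
import numpy as np

def downsample(x, M):
    """M-fold downsampler: y(n) = x(Mn)."""
    return x[::M]

def expand(x, M):
    """M-fold expander: y(n) = x(n/M) if n is a multiple of M, else 0."""
    y = np.zeros(len(x) * M, dtype=x.dtype)
    y[::M] = x
    return y

def polyphase(x, M):
    """Polyphase components x_k(n) = x(Mn + k), k = 0, ..., M-1.
    Assumes len(x) is a multiple of M."""
    return [x[k::M] for k in range(M)]

def from_polyphase(components):
    """Time-domain form of X = sum_k e^{-jwk} X_k(e^{jMw}): interleave the x_k."""
    M = len(components)
    x = np.zeros(M * len(components[0]), dtype=components[0].dtype)
    for k, xk in enumerate(components):
        x[k::M] = xk
    return x

def block(x, M):
    """M-fold blocking as in (1): row n is (x(nM), x(nM-1), ..., x(nM-M+1)).
    Assumption: x(n) = 0 for n < 0; trailing samples past the last x(nM) are dropped."""
    x = np.asarray(x)
    num_blocks = (len(x) - 1) // M + 1
    padded = np.concatenate([np.zeros(M - 1, dtype=x.dtype), x])  # padded[i] = x(i - (M-1))
    return np.array([padded[n * M : n * M + M][::-1] for n in range(num_blocks)])

x = np.arange(1, 8)          # x(0), ..., x(6) = 1, ..., 7
print(block(x, 3))           # rows: (x(0),0,0), (x(3),x(2),x(1)), (x(6),x(5),x(4))
```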

III. PRELIMINARY RESULTS


Result 1. Consider any $L$ synthesis filters ($L < M$) of an $M$-channel orthonormal filter bank as shown in Fig.
2. Assume that the $L$ inputs $y_k(n)$ to the synthesis filters $F_k(e^{j\omega})$ are zero mean jointly WSS processes, not
necessarily uncorrelated. Then, the statistical correlation (averaged over $M$ samples) between the interpolated
subband signal $x_i(n)$ and the $M$-sample shifted process $x_j(n - Mm)$ is zero, for all values of $i \neq j$ and $m$, that is:

$$\frac{1}{M}\sum_{k=0}^{M-1} E[x_i(n-k)\, x_j^*(n-k-Mm)] = 0, \quad \forall\, n, m \;\text{ and }\; \forall\, i \neq j \in [0, L-1] \tag{3}$$


The proof can be found in appendix A. As a consequence, the average variance of the $(CWSS)_M$ output process
$x(n)$ of Fig. 2, where the filters $F_k(e^{j\omega})$ are any $L$ synthesis filters of an $M$-channel orthonormal filter bank, is:

$$\sigma_x^2 = \frac{1}{M}\sum_{k=0}^{L-1}\sigma_{y_k}^2 \tag{4}$$

This can be seen by substituting $x(n)$ in the formula $\sigma_x^2 = \frac{1}{M}\sum_{n=0}^{M-1} E[|x(n)|^2]$ and using result 1 for the special case
of $m = 0$ and $n = M-1$. If the $L$ inputs to the synthesis filters $F_k(e^{j\omega})$ are zero mean uncorrelated WSS processes,
the previous result holds without the orthonormality requirement on the filters $F_k(e^{j\omega})$, $k = 0, 1, \ldots, L-1$.
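
Equation (4) can be checked numerically in a simple special case. The sketch below assumes, purely for brevity, that every synthesis filter has length $M$ (a block transform), so that one output block is $\mathbf{F}\mathbf{y}(n)$ with $\mathbf{F}$ an $M \times L$ matrix whose orthonormal columns hold the filter taps; the function name and the Hadamard example are illustrative choices, not taken from the paper.

```python
import numpy as np

def average_output_variance(F, Cy):
    """Average variance of x(n) over one period of M samples.

    F  : M x L matrix with orthonormal columns (assumed length-M synthesis filters).
    Cy : L x L covariance matrix of the zero mean inputs (y_0(n), ..., y_{L-1}(n)).
    """
    M = F.shape[0]
    Cx = F @ Cy @ F.conj().T            # covariance of one block of M output samples
    return np.real(np.trace(Cx)) / M    # (1/M) * sum of the M per-sample variances

# Two channels (L = 2) of a normalized 4 x 4 Hadamard transform (M = 4)
H = np.array([[1, 1, 1, 1],
              [1, -1, 1, -1],
              [1, 1, -1, -1],
              [1, -1, -1, 1]]) / 2.0
F = H[:, :2]
Cy = np.array([[2.0, 0.7],
               [0.7, 1.0]])             # correlated inputs with variances 2 and 1
print(average_output_variance(F, Cy))   # compare with (2 + 1) / 4 from (4)
```

The cross terms drop out because the columns of $\mathbf{F}$ are orthonormal, which mirrors the role of result 1 in the general case.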


Result 2. Consider the multirate interconnection of Fig. 1 where the input $y(n)$ is a zero mean WSS random
process. If $F(e^{j\omega})$ is a filter (not necessarily ideal) with a Nyquist($M$) magnitude squared response, then

$$\sigma_x^2 = \frac{1}{M}\sigma_y^2 \tag{5}$$

where $\sigma_x^2$ is the average variance of the $(CWSS)_M$ output $x(n)$.

Proof. While this is a special case of the above with $L = 1$, the following proof is direct and more instructive.
With $F(e^{j\omega})$ expressed in terms of its polyphase components $R_k(e^{j\omega})$, Fig. 1 can be redrawn as in Fig. 13. The
signal $x(n)$ is the interleaved version of the WSS outputs of $R_k(e^{j\omega})$. So, it has zero mean and a variance which
is periodic with period $M$. The average variance is given by:

$$\sigma_x^2 = \frac{1}{M}\sum_{k=0}^{M-1}\int_{-\pi}^{\pi} S_{yy}(e^{j\omega})\,|R_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi} = \frac{1}{M}\int_{-\pi}^{\pi} S_{yy}(e^{j\omega})\sum_{k=0}^{M-1}|R_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi} \tag{6}$$

The Nyquist property of $|F(e^{j\omega})|^2$ implies in particular that $\sum_{k=0}^{M-1} |R_k(e^{j\omega})|^2 = 1$ (see [5], p. 159). The preceding
equation therefore simplifies to $\sigma_x^2 = \frac{1}{M}\int_{-\pi}^{\pi} S_{yy}(e^{j\omega})\frac{d\omega}{2\pi} = \frac{1}{M}\sigma_y^2$.
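
The key step of the proof, $\sum_k |R_k(e^{j\omega})|^2 = 1$, can be verified on a frequency grid. The sketch below uses the four-tap Daubechies lowpass filter with $M = 2$ as an assumed example of a finite order filter whose magnitude squared response is Nyquist($M$); neither the filter choice nor the function name comes from the paper.

```python
import numpy as np

def polyphase_power_sum(f, M, num_freqs=64):
    """Evaluate sum_k |R_k(e^{jw})|^2 on a uniform grid of frequencies,
    where r_k(n) = f(Mn + k) are the polyphase components of f."""
    f = np.asarray(f, dtype=float)
    w = np.linspace(-np.pi, np.pi, num_freqs, endpoint=False)
    total = np.zeros(num_freqs)
    for k in range(M):
        rk = f[k::M]
        n = np.arange(len(rk))
        Rk = np.exp(-1j * np.outer(w, n)) @ rk    # R_k(e^{jw}) at each grid point
        total += np.abs(Rk) ** 2
    return total

s3 = np.sqrt(3.0)
daub4 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
print(np.allclose(polyphase_power_sum(daub4, 2), 1.0))       # power complementary

box3 = np.ones(3) / np.sqrt(3.0)                             # unit energy, but not Nyquist(2)
print(np.allclose(polyphase_power_sum(box3, 2), 1.0))
```

When the sum is identically one, the integrand in (6) reduces to $S_{yy}(e^{j\omega})$ regardless of the input spectrum, which is why (5) holds for any WSS input and not only for white ones.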


IV.  FILTER  AND  QUANTIZER  ASSUMPTIONS 

Filter assumptions. The filters $F(e^{j\omega})$ of Fig. 1 and $F_k(e^{j\omega})$, $k = 0, 1, \ldots, L-1$, of Fig. 2 are assumed to be
the synthesis filters of any $L$ channels of an $M$-channel maximally decimated orthonormal filter bank. Although
not necessary for developing the results of this paper, we will additionally choose the $L$ channels of the $M$-channel
maximally decimated orthonormal filter bank to be the most dominant ones in terms of subband energy. The
model filters are therefore the so-called optimum energy compaction filters. This last constraint is motivated by
the fairly recent result that this particular choice of filters minimizes the mean square reconstruction error between
the original signal $x(n)$ and its approximation $\hat{x}(n)$ [9], [10]. We would like however to emphasize that, unlike
previous work, the filters in this paper are assumed to be of finite order. Working with ideal brick wall filters would
obviously contradict the non bandlimited assumption.

Quantizer assumption. As a convention for this paper, the box labeled Q represents a scalar uniform (PCM)
quantizer and is modeled as an additive zero mean white noise source $q(n)$. Because the model filters are not ideal,
the input $x(n)$ is a zero mean $(CWSS)_M$ process. Since the input to the quantizer $x(n)$ is a $(CWSS)_M$ process,
its variance $\sigma_x^2(n)$ is a periodic function of $n$ with period $M$. Define $\sigma_x^2$ to be the average variance of $x(n)$, i.e.,
$\sigma_x^2 = \frac{1}{M}\sum_{n=0}^{M-1}\sigma_x^2(n)$. Then, choose the fixed step size $\Delta$ in the uniform quantizer such that the quantization noise
variance $\sigma_q^2$ is directly proportional to the average variance of the quantizer input $x(n)$, that is

$$\sigma_q^2 = c\,2^{-2b}\,\sigma_x^2 \tag{7}$$

where $\sigma_q^2$ is the quantization noise variance, $b$ is the number of bits per sample, $c$ is a constant that depends on
the statistical distribution of $x(n)$ and the overflow probability, and $\sigma_x^2$ is the average variance of the quantizer input.
The above relation is justified for a PCM quantizer using 3 (or more) bits per sample (see chapter 4 in [11]). If the
input to Q is wide-sense stationary, the above relation holds with $\sigma_x^2$ now denoting the actual variance of the WSS process.
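
A minimal sketch of the noise model (7) follows. The function names are hypothetical and the default $c = 1$ is only a placeholder, since the paper leaves $c$ dependent on the input distribution and the overflow probability.

```python
import numpy as np

def average_variance(periodic_variances):
    """Average of the M per-phase variances sigma_x^2(n), n = 0, ..., M-1."""
    return float(np.mean(periodic_variances))

def quantization_noise_variance(avg_input_variance, bits, c=1.0):
    """Noise model (7): sigma_q^2 = c * 2^(-2b) * sigma_x^2.
    c = 1.0 is an arbitrary placeholder, not a value from the paper."""
    return c * 2.0 ** (-2 * bits) * avg_input_variance

sigma_x2 = average_variance([0.5, 1.5])           # (CWSS)_2 input with average variance 1
for b in (3, 4, 5):
    print(b, quantization_noise_variance(sigma_x2, b))
```

Under this model each additional bit divides the noise variance by four (about 6 dB), which is the yardstick used when noise reductions are later translated into equivalent bits.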

V.  INCREASING  THE  QUANTIZER  RESOLUTION  BY  MULTIRATE  FILTERING 

Consider the setup shown in Fig. 5 for the single band model and in Fig. 14 for the multiband case. In the
absence of quantization, the two schemes are perfect reconstruction systems. In the presence of the quantizer,
the output $\hat{x}(n)$ in Fig. 5 and Fig. 14 is equal to the original sequence $x(n)$ plus an error signal $e(n)$ due to
quantization. The following result shows that, by using the above schemes, a significant reduction in the average
mean square error $\mathcal{E} = \frac{1}{M}\sum_{n=0}^{M-1} E[|e(n)|^2]$ can be obtained in comparison with the direct quantization of $x(n)$
shown in Fig. 12.

Theorem 5.1. Consider the scheme of Fig. 14 where the $L$ filters $F_k(e^{j\omega})$ are assumed to be any $L$ channels
of an $M$-channel critically sampled orthonormal filter bank. Under the above quantization noise assumption, the
average mean square error (m.s.e.) $\mathcal{E} = \frac{1}{M}\sum_{n=0}^{M-1} E[|x(n) - \hat{x}(n)|^2]$ is equal to $\frac{L}{M}\sigma_q^2$.


Proof. Because the system is a perfect reconstruction one, the average error at the output is due only to the
quantization noise. The quantization noise $q(n)$ is white and propagates through the $L$ channels of Fig. 14. For
the $k$th channel, the variance of $u_k(n)$ due to the noise passage through $F_k(e^{j\omega})$ is given by:

$$\sigma_{u_k}^2 = \sigma_q^2 \int_{-\pi}^{\pi} |F_k(e^{j\omega})|^2 \frac{d\omega}{2\pi} = \sigma_q^2 \tag{8}$$

The second equality follows because the filters have unit energy. The downsampling operation does not alter the
variance of a signal. The decimated signal in the $k$th channel therefore also has variance $\sigma_{u_k}^2 = \sigma_q^2$ for all $k$.
Using result 1 of section III, we can write

$$\mathcal{E} = \frac{1}{M}\sum_{k=0}^{L-1}\sigma_q^2 = \frac{L}{M}\sigma_q^2 \tag{9}$$
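
The single band case ($L = 1$) of theorem 5.1 can be illustrated with a small simulation. The sketch assumes a length-$M$ unit-energy filter (so that $|F(e^{j\omega})|^2$ is automatically Nyquist($M$)), models the quantizer by additive white Gaussian noise as in section IV, and uses hypothetical function names; it is not the general finite order case treated in the paper.

```python
import numpy as np

def synthesize(y, f):
    """Expander followed by a length-M filter f: x(Mn + k) = y(n) f(k)."""
    return np.outer(y, f).ravel()

def analyze(x, f):
    """Matched filter followed by the M-fold decimator:
    v(n) = sum_k conj(f(k)) x(Mn + k). Assumes len(x) is a multiple of M."""
    M = len(f)
    return x.reshape(-1, M) @ np.conj(f)

def average_output_noise(f, noise_var, num_blocks=50_000, seed=0):
    """Monte Carlo estimate of the average m.s.e. when white noise of
    variance noise_var passes through the analysis/synthesis cascade."""
    rng = np.random.default_rng(seed)
    q = rng.normal(scale=np.sqrt(noise_var), size=num_blocks * len(f))
    e = synthesize(analyze(q, f), f)
    return float(np.mean(np.abs(e) ** 2))

M = 4
f = np.ones(M) / np.sqrt(M)                  # unit energy, length M
y = np.arange(1.0, 9.0)
x = synthesize(y, f)                         # signal obeying the single band model
print(np.allclose(synthesize(analyze(x, f), f), x))   # signal component is untouched
print(average_output_noise(f, 1.0), 1.0 / M)          # estimate vs. sigma_q^2 / M
```

The signal passes through unchanged because it lies in the span of the model filter, while the white noise is projected onto that span and loses a fraction $(M-1)/M$ of its power on average.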


For the scheme of Fig. 5, the average m.s.e. $\mathcal{E}$ can be obtained directly by setting $L = 1$ and is therefore equal
to $\frac{1}{M}\sigma_q^2$. The quantization noise variance $\sigma_q^2$ obtained by directly quantizing $x(n)$ as shown in Fig. 12 is now
reduced by the oversampling factor $M$. The signal variance $\sigma_x^2$, on the other hand, did not change. By expressing
the interpolation factor $M$ in the form $2^r$, we can immediately see that we get the same quantitative advantage as the
oversampling PCM technique, namely, an increase in SNR by 3 dB for every doubling of the oversampling factor.
For example, for the single band case of Fig. 5, if $M = 2$, then we get an SNR increase of 3 dB whereas if $M = 4$,
the SNR increment is 6 dB. Some important remarks are in order at this point:

1. In the oversampling PCM technique, the quantized bandlimited signal is typically downsampled after the low
pass filter [6]. The SNR before and after the downsampler is the same and the increase in SNR is only due
to a reduction in noise power. Similarly, the SNR before and after the interpolation filter in Fig. 5 does not
change. However, the reason for the SNR increase before the interpolation filter is different from the one after the
interpolation filter. Specifically, at the input of the interpolation filter, the signal variance increases proportionally
to $M$ since $\sigma_y^2 = M\sigma_x^2$ and the noise power remains fixed. At the output of the interpolation filter, the signal
variance does not change but the noise power decreases in proportion to $M$. In both cases, this amounts to the same
SNR improvement. This last technical difference arises because our study assumes a statistical framework rather
than a deterministic one (typical in A/D conversion applications) and because of our quantizer assumptions.

2. Intuitive explanation of theorem 5.1. The signal $x(n)$, modeled either as in Fig. 1 or Fig. 2, is oversampled
and therefore contains redundant information in the form of an excess of samples. It is by quantizing these extra
samples that we obtain the reduction in the quantization noise variance (equivalently in the mean square error).
We are therefore effectively quantizing with a higher number of bits per sample. This trade off between the
quantization noise variance (effective quantizer resolution) and the sampling rate is the underlying principle of
oversampled A/D converters.

3. The role of the factor $L$ in this analysis. The parameter $L$, defined to be the number of channels in the
multiband case, ranges between two extremes: $L = 1$ and $L = M$. When $L = 1$, we get the best SNR
improvement at the expense of a narrower class of inputs $x(n)$. When $L = M$, it is clear from (9) that no noise
variance reduction is achieved since the class of signals is now unrestricted. We can also see this by noticing that
the multirate interconnection in Fig. 14 becomes a perfect reconstruction filter bank that is signal independent.
The parameter $L$ therefore determines the tradeoff between the generality of the class of signals $x(n)$ and the
reduction in quantization noise variance.

4. A cascade of the scheme of Fig. 5 does not provide any further gain. Using the scheme of Fig. 5, we obtained
a reduction in noise by a factor $M$. If we use a cascade of the same filtering scheme as in Fig. 15, no further
noise reduction is obtainable. Using the polyphase identity [5] and keeping in mind that $|F(e^{j\omega})|^2$ is Nyquist($M$),
the product filter $F(e^{j\omega})F^*(e^{j\omega})$ together with the expander and decimator reduces to an identity system. Fig. 15
therefore simplifies to Fig. 5 and the average m.s.e. is the same.
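
The tradeoff described in remark 3 can be tabulated directly from theorem 5.1: the noise variance drops from $\sigma_q^2$ to $\frac{L}{M}\sigma_q^2$ while the signal variance is unchanged, so the SNR gain is $10\log_{10}(M/L)$ dB. The short sketch below (function name hypothetical) evaluates this expression.

```python
import math

def snr_gain_db(M, L=1):
    """SNR gain in dB implied by theorem 5.1: noise variance scales by L/M,
    signal variance is unchanged."""
    if not 1 <= L <= M:
        raise ValueError("L must satisfy 1 <= L <= M")
    return 10.0 * math.log10(M / L)

for M, L in [(2, 1), (4, 1), (4, 2), (4, 4)]:
    print(f"M={M}, L={L}: {snr_gain_db(M, L):.2f} dB")
```

With $L = 1$ the gain is about 3 dB per doubling of $M$, and it vanishes at $L = M$, matching the two extremes discussed above.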

VI.  QUANTIZING  AT  LOWER  RATE 

A natural question raised by the previous results and discussion is the following: what if the discrete time filtering of the oversampled signal is not a major burden? If we know that x(n) can be modeled quite accurately by the filter $F(e^{j\omega})$ of Fig. 1 or the filters $F_k(e^{j\omega})$, k = 0, 1, ..., L-1, of Fig. 2, we filter and downsample x(n) accordingly to obtain either y(n) or $y_k(n)$, k = 0, 1, ..., L-1. The quantization systems for the two models are shown in Fig. 7 and Fig. 16 respectively. We can then in principle quantize the decimated signal y(n) in Fig. 7 with $\bar{b} = Mb$ bits per sample or the signals $y_k(n)$, k = 0, 1, ..., L-1, of Fig. 16 with an average number of bits per sample $\bar{b} = \frac{M}{L}b$. This situation is equivalent to fixing the bit rate (number of bits per second) to be equal to b in order to trade quantization resolution with sampling rate. Moreover, for the multiband case, we can allocate bits $b_k$ to the driving signals $y_k(n)$ in an "appropriate" manner. At this point, we will however assume that the goal is to actually obtain a reduction in the bit rate. To achieve this, we let $\bar{b}$ be equal to b for both cases and analyze the quantization systems of Fig. 7 and Fig. 16 under this condition. By fixing the number of bits per sample and decreasing the signal rate, the bit rate will automatically decrease by a factor of M/L. However, since the quantizer resolution did not increase, the quantization noise variance should not differ from the direct quantization case of Fig. 12. This last statement is verified formally in the next theorems.

Theorem 6.1. Consider the scheme of Fig. 7. Using a fixed number of bits per sample b to quantize y(n), the average mean square error $\mathcal{E}$ is equal to $\sigma_q^2$, where $\sigma_q^2$ is the noise variance obtained from directly quantizing x(n) using b bits per sample.

Proof. Let $\sigma_q^2$ be the noise variance of Fig. 12 and $\mathcal{E}$ be the average mean square error of Fig. 7. Using (7), we can write $\sigma_q^2 = c2^{-2b}\sigma_x^2$. But, by result 2 of section III, $\mathcal{E} = \frac{1}{M}c2^{-2b}\sigma_y^2 = \frac{1}{M}c2^{-2b}M\sigma_x^2 = \sigma_q^2$, where $\sigma_x^2$ is the average variance of x(n). ■

The  theorem  indicates  that,  for  the  single  band  model  and  under  a  fixed  number  of  quantizer  bits  b,  quantizing 
the  lower  rate  signal  y(n)  is  as  accurate  as  directly  quantizing  x(n).  This  is  expected  and  is  in  fact  consistent  with 
the  observation  of  section  V  regarding  the  tradeoff  between  the  average  m.s.e  due  to  quantization  and  the  rate  of 
the  signal.  The  next  theorem  for  the  multiband  case  gives  a  similar  conclusion. 
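
The bookkeeping behind Theorem 6.1 can be traced in a few lines. This is only a sketch under the additive white noise model of (7): it assumes $F(z) = \frac{1}{\sqrt{2}}(1 + z^{-1})$ with M = 2, the values b = 3 and c = 2.4 used in the later examples, and an arbitrary variance `var_y` for y(n). The function names are hypothetical.

```python
def direct_noise_variance(c, b, var_x):
    """Noise variance (7) for direct b-bit quantization of x(n)."""
    return c * 2.0 ** (-2 * b) * var_x

def decimated_scheme_mse(c, b, var_y, f, M):
    """Average m.s.e. when y(n) is quantized with b bits and then interpolated by F."""
    var_q = c * 2.0 ** (-2 * b) * var_y     # white noise added at the low rate
    energy = sum(tap * tap for tap in f)    # equals 1 when |F|^2 is Nyquist(M)
    return var_q * energy / M               # time-averaged variance of the interpolated noise

M, b, c = 2, 3, 2.4
f = [2 ** -0.5, 2 ** -0.5]
var_y = 1.7                                 # arbitrary variance of y(n)
var_x = var_y / M                           # average variance of x(n), result 2 of section III

mse_low_rate = decimated_scheme_mse(c, b, var_y, f, M)
mse_direct = direct_noise_variance(c, b, var_x)
bits_per_input_sample = b / M               # versus b for direct quantization
print(mse_low_rate, mse_direct, bits_per_input_sample)
```

The two error values coincide while the number of bits spent per sample of x(n) drops from b to b/M, which is the trade described in the theorem.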


Theorem 6.2. Consider the scheme of Fig. 16. Assume that we quantize $y_k(n)$ with b bits per sample for all k. Then, the average mean square error $\mathcal{E}$ is equal to $\sigma_q^2$, where $\sigma_q^2$ is the noise variance obtained from directly quantizing x(n) using b bits per sample.

Proof. The average mean square error at the output of Fig. 16 is equal to

$$\mathcal{E} = \frac{1}{M}\sum_{k=0}^{L-1}\sigma_{q_k}^2 = \frac{c2^{-2b}}{M}\sum_{k=0}^{L-1}\sigma_{y_k}^2 \tag{10}$$

where $\sigma_{y_k}^2$ is the variance of $y_k(n)$ and b denotes the fixed number of bits allocated to the kth channel quantizer. The noise variance $\sigma_q^2$ in Fig. 12 is equal to $c2^{-2b}\sigma_x^2$, which in turn is equal to (10). ■


VII. NOISE SHAPING BY TIME-INVARIANT PRE- AND POST FILTERS

Following  the  philosophy  of  sigma-delta  modulators,  we  now  perform  noise  shaping  to  achieve  a  further 
reduction  in  the  average  mean  square  error.  To  accomplish  this,  we  propose  using  LTI  pre-  and  post  filters  around 
the  PCM  quantizer  as  shown  in  Fig.  6  for  the  single  band  model  and  in  Fig.  17  for  the  multiband  model.  We 
first use a prefilter $P(e^{j\omega})$ and assume that the postfilter is its inverse. We then relax this condition and assume
a more general postfilter $V(e^{j\omega})$. The goal is to optimize these filters such that the average m.s.e at the output
of  either  quantization  system  is  minimized.  The  noise  shaping  filters  to  be  optimized  are  not  constrained  to  be 
rational  functions  (i.e.,  of  finite  order)  and  non  causal  solutions,  for  example,  are  accepted. 

Although our quantizer design assumptions are the same as before, the quantizer input is no longer the $(CWSS)_M$ process x(n), but a filtered version of it, which we denote by z(n). Following (7), the noise variance in this case is given by $\sigma_q^2 = c2^{-2b}\sigma_z^2$ where $\sigma_z^2$ is the average variance of the process z(n). We emphasize that z(n) is a $(CWSS)_M$ process since the output of a linear time invariant filter driven by a $(CWSS)_M$ process is also $(CWSS)_M$ [8]. It is then possible to express $\sigma_z^2$ in terms of the prefilter $P(e^{j\omega})$ and the so called average power spectral density (see below) of the process x(n), denoted by $S_{xx}(e^{j\omega})$, as follows:

$$\sigma_z^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi}|P(e^{j\omega})|^2 S_{xx}(e^{j\omega})\,d\omega \tag{11}$$

The proof of (11) can be found in appendix C. The average power spectral density is a familiar concept that arises when “stationarizing” a $(CWSS)_M$ process [12], [13], [14] and satisfies the well known properties of the power spectrum of a WSS process. It is defined to be the discrete time Fourier transform of the time averaged autocorrelation function $R_{xx}(k)$ given by $\frac{1}{M}\sum_{n=0}^{M-1}E[x(n)x^*(n-k)]$. Another interpretation of the average power spectral density which can be physically more appealing is based on the concept of phase randomization and is reviewed in appendix B. Finally, if x(n) is modeled as in Fig. 1, it can be shown that:

$$S_{xx}(e^{j\omega}) = \frac{1}{M}S_{yy}(e^{j\omega M})|F(e^{j\omega})|^2 \tag{12}$$

whereas if the signal satisfies the multiband model of Fig. 2, the average power spectral density takes the following form:

$$S_{xx}(e^{j\omega}) = \frac{1}{M}\mathbf{F}^{\dagger}(e^{j\omega})\,\mathbf{S}_y(e^{j\omega M})\,\mathbf{F}(e^{j\omega}) \tag{13}$$

where $\mathbf{F}(e^{j\omega}) = (\,F_0(e^{j\omega})\ \ F_1(e^{j\omega})\ \ \ldots\ \ F_{L-1}(e^{j\omega})\,)^T$ and $\mathbf{S}_y(e^{j\omega})$ is the L x L power spectral density matrix of the L WSS inputs $y_k(n)$. Note that, when the signals $y_k(n)$ are uncorrelated, equation (13) simplifies to $\frac{1}{M}\sum_{k=0}^{L-1}S_{y_k}(e^{j\omega M})|F_k(e^{j\omega})|^2$. The proofs of (12) and (13) are given in appendix D. The expression (12) was derived previously in [8] for the special case where $F(e^{j\omega})$ is an anti-alias(M) filter. Furthermore, the authors of [8] prove that the output process x(n) is WSS if and only if $F(e^{j\omega})$ is an anti-alias(M) filter. In summary, the statistical properties of the output x(n) of Fig. 1 depend on $F(e^{j\omega})$. If the filter is an anti-alias(M) filter, then x(n) is WSS with a power spectral density $S_{xx}(e^{j\omega})$ in the same form as (12). Otherwise, x(n) is a $(CWSS)_M$ process and in this case, the average power spectral density $S_{xx}(e^{j\omega})$ is given by (12).
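
Relation (12) can be cross-checked in the time domain. The sketch below assumes real signals, M = 2, $F(z) = \frac{1}{\sqrt{2}}(1 + z^{-1})$ and a unit-variance MA(1) input with lag-one correlation $-\theta/(1+\theta^2)$, $\theta = 0.5$; these values are chosen only for illustration. It compares the time-averaged autocorrelation computed from the model $x(n) = \sum_m f(n - Mm)\,y(m)$ with the inverse transform of the right-hand side of (12), namely $\frac{1}{M}\sum_m R_{yy}(m)\,r_f(k - Mm)$, where $r_f$ is the deterministic autocorrelation of the filter.

```python
import math

def tap(f, n):
    """Impulse response value f(n), zero outside the filter support."""
    return f[n] if 0 <= n < len(f) else 0.0

def avg_autocorr_direct(f, Ryy, M, k, span=6):
    """(1/M) sum_{n=0}^{M-1} E[x(n) x(n-k)] with x(n) = sum_m f(n - M m) y(m)."""
    total = 0.0
    for n in range(M):
        for m in range(-span, span + 1):
            for l in range(-span, span + 1):
                total += tap(f, n - M * m) * tap(f, n - k - M * l) * Ryy(m - l)
    return total / M

def filter_autocorr(f, t):
    """Deterministic autocorrelation r_f(t) = sum_i f(i) f(i + |t|)."""
    t = abs(t)
    if t >= len(f):
        return 0.0
    return sum(f[i] * f[i + t] for i in range(len(f) - t))

def avg_autocorr_from_12(f, Ryy, M, k, span=6):
    """Inverse DTFT of (1/M) S_yy(e^{jwM}) |F(e^{jw})|^2 evaluated at lag k."""
    return sum(Ryy(m) * filter_autocorr(f, k - M * m)
               for m in range(-span, span + 1)) / M

theta = 0.5                                  # illustrative MA(1) parameter

def Ryy(m):
    if m == 0:
        return 1.0
    if abs(m) == 1:
        return -theta / (1 + theta ** 2)
    return 0.0

M = 2
f = [1 / math.sqrt(2), 1 / math.sqrt(2)]
lags = list(range(-4, 5))
direct = [avg_autocorr_direct(f, Ryy, M, k) for k in lags]
from_12 = [avg_autocorr_from_12(f, Ryy, M, k) for k in lags]
print(direct)
print(from_12)
```

Both lists agree lag by lag, and the zero-lag value equals $\sigma_y^2/M$, the average variance of x(n).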

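7.1. Case where the postfilter is the inverse of the prefilter

Theorem 7.1.1. Consider the scheme of Fig. 17 under the same assumptions of section IV. The optimum prefilter $P(e^{j\omega})$ that minimizes the average mean square reconstruction error has the following magnitude squared response:

$$|P_{opt}(e^{j\omega})|^2 = \sqrt{\frac{\sum_{k=0}^{L-1}|F_k(e^{j\omega})|^2}{S_{xx}(e^{j\omega})}} \tag{14}$$

Proof. We first observe that in the absence of quantization, the system of Fig. 17 is a perfect reconstruction system. Therefore, the average mean square reconstruction error $\frac{1}{M}\sum_{n=0}^{M-1}E[x(n)-\hat{x}(n)]^2$ at the output is due only to the noise signal. Let $v_k(n)$ be the filtered noise component in the kth channel of the L-channel filter bank of Fig. 17. The variance of this signal is equal to

$$\sigma_{v_k}^2 = \sigma_q^2\int_{-\pi}^{\pi}\frac{|F_k(e^{j\omega})|^2}{|P(e^{j\omega})|^2}\,\frac{d\omega}{2\pi} \tag{15}$$

Since the downsampling operation does not change the variance of a process, we can write

$$\mathcal{E} = \frac{1}{M}\sum_{k=0}^{L-1}\sigma_{v_k}^2 = \frac{\sigma_q^2}{M}\int_{-\pi}^{\pi}\frac{\sum_{k=0}^{L-1}|F_k(e^{j\omega})|^2}{|P(e^{j\omega})|^2}\,\frac{d\omega}{2\pi} \tag{16}$$

Using (7) and (11), we get

$$\mathcal{E} = \frac{c2^{-2b}}{M}\int_{-\pi}^{\pi}S_{xx}(e^{j\omega})|P(e^{j\omega})|^2\,\frac{d\omega}{2\pi}\int_{-\pi}^{\pi}\frac{\sum_{k=0}^{L-1}|F_k(e^{j\omega})|^2}{|P(e^{j\omega})|^2}\,\frac{d\omega}{2\pi} \tag{17}$$

To find the optimum prefilter $P(e^{j\omega})$, we apply the Cauchy-Schwarz inequality to (17) to obtain:

$$\mathcal{E} \ge \frac{c2^{-2b}}{M}\left(\int_{-\pi}^{\pi}\sqrt{S_{xx}(e^{j\omega})\sum_{k=0}^{L-1}|F_k(e^{j\omega})|^2}\,\frac{d\omega}{2\pi}\right)^2 \tag{18}$$

Since this lower bound is independent of $P(e^{j\omega})$, it is indeed the required minimum and is achieved iff (up to a positive scale factor, which does not affect $\mathcal{E}$)

$$\sqrt{S_{xx}(e^{j\omega})}\,|P(e^{j\omega})| = \frac{\sqrt{\sum_{k=0}^{L-1}|F_k(e^{j\omega})|^2}}{|P(e^{j\omega})|} \tag{19}$$

which gives (14). ■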
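
The Cauchy-Schwarz step can be checked on a frequency grid. The sketch below is an illustration only: it assumes the single band case (L = 1) with M = 2, $F(z) = \frac{1}{\sqrt{2}}(1 + z^{-1})$ so that $|F(e^{j\omega})|^2 = 1 + \cos\omega$, and an AR(1) spectrum for y(n) with $\rho = 0.6$. A midpoint grid is used so that the zero of F at $\omega = \pi$ is never sampled. The quantity `error_product` is the part of (17) that depends on the prefilter.

```python
import math

N, M, rho = 4096, 2, 0.6
# Midpoint grid on [-pi, pi): avoids evaluating exactly at the zero of F (w = pi)
w = [-math.pi + (i + 0.5) * 2 * math.pi / N for i in range(N)]

def mean(values):
    return sum(values) / len(values)

def Syy(v):
    """AR(1) power spectrum (an assumed example input)."""
    return (1 - rho ** 2) / (1 + rho ** 2 - 2 * rho * math.cos(v))

F2 = [1 + math.cos(v) for v in w]                       # |F|^2 for F = (1 + z^-1)/sqrt(2)
Sxx = [Syy(M * v) * f2 / M for v, f2 in zip(w, F2)]     # average PSD, relation (12)

def error_product(P2):
    """mean(Sxx |P|^2) * mean(|F|^2 / |P|^2): the P-dependent factor in (17)."""
    return (mean([s * p for s, p in zip(Sxx, P2)])
            * mean([f2 / p for f2, p in zip(F2, P2)]))

P2_opt = [math.sqrt(f2 / s) for f2, s in zip(F2, Sxx)]  # (14) with L = 1
bound = mean([math.sqrt(s * f2) for s, f2 in zip(Sxx, F2)]) ** 2

J_opt = error_product(P2_opt)
J_flat = error_product([1.0] * N)                       # no noise shaping
print(bound, J_opt, J_flat)
```

The optimum prefilter meets the bound of (18), while the flat prefilter (no shaping) stays above it.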

A number of observations should be made at this point. First, the optimum filter is not unique since the phase response is not specified. Second, the above derivation assumes that the input average spectrum $S_{xx}(e^{j\omega}) \neq 0$ for all $\omega$. The assumption is a reasonable one because x(n) is assumed to be non bandlimited and therefore $S_{xx}(e^{j\omega})$ cannot be identically zero on a segment of $[0, 2\pi)$. If $S_{xx}(e^{j\omega})$ has an isolated zero for some $\omega$, then, by (14), the resulting prefilter will have a pole on the unit circle and is therefore unstable. In any case, a practical system would use only a stable rational approximation of the ideal solution. Finally, we note that the optimum filter for the scheme of Fig. 6 can be obtained again as a special case by setting L = 1 in (14). The optimum prefilter will then have the following magnitude squared response:

$$|P_{opt}(e^{j\omega})|^2 = \frac{|F(e^{j\omega})|}{\sqrt{S_{xx}(e^{j\omega})}} \tag{20}$$

and can be regarded as a multirate extension of the half whitening filter [11]. Using (20), we can derive an interesting expression for the coding gain of the scheme of Fig. 6.


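Theorem 7.1.2. With the optimum choice of the pre- and post filter, the coding gain expression for the scheme of Fig. 6 is

$$G_{opt} = \frac{M\sigma_y^2}{\left(\int_{-\pi}^{\pi}\sqrt{S_{yy}(e^{j\omega})}\,\frac{d\omega}{2\pi}\right)^2} = M\,G_{hw} \tag{21}$$

where $G_{hw}$ is the half whitening coding gain of the WSS process y(n) [11].

Proof. By definition, the coding gain of the system is given by

$$G_{opt} = \frac{\sigma_q^2}{\mathcal{E}_{opt}} = \frac{M\int_{-\pi}^{\pi}S_{xx}(e^{j\omega})\,\frac{d\omega}{2\pi}}{\left(\int_{-\pi}^{\pi}\sqrt{S_{xx}(e^{j\omega})}\,|F(e^{j\omega})|\,\frac{d\omega}{2\pi}\right)^2} \tag{22}$$

Substituting (12) in (22) and simplifying, we get

$$G_{opt} = \frac{M\int_{-\pi}^{\pi}S_{yy}(e^{jM\omega})|F(e^{j\omega})|^2\,\frac{d\omega}{2\pi}}{\left(\int_{-\pi}^{\pi}\sqrt{S_{yy}(e^{jM\omega})}\,|F(e^{j\omega})|^2\,\frac{d\omega}{2\pi}\right)^2} \tag{23}$$

The integrals in both the numerator and the denominator can be interpreted as the variance of a WSS random process with a power spectrum density equal to $S_{yy}(e^{jM\omega})|F(e^{j\omega})|^2$ and $\sqrt{S_{yy}(e^{jM\omega})}\,|F(e^{j\omega})|^2$ respectively. But we know that downsampling a WSS process produces another WSS process with the same variance. Therefore, we can write

$$G_{opt} = \frac{M\int_{-\pi}^{\pi}\left(S_{yy}(e^{jM\omega})|F(e^{j\omega})|^2\right)\big|_{\downarrow M}\,\frac{d\omega}{2\pi}}{\left(\int_{-\pi}^{\pi}\left(\sqrt{S_{yy}(e^{jM\omega})}\,|F(e^{j\omega})|^2\right)\big|_{\downarrow M}\,\frac{d\omega}{2\pi}\right)^2} \tag{24}$$

Using the fact that $\left(S_{yy}(e^{jM\omega})|F(e^{j\omega})|^2\right)\big|_{\downarrow M} = S_{yy}(e^{j\omega})\left(|F(e^{j\omega})|^2\right)\big|_{\downarrow M}$ and that $\left(|F(e^{j\omega})|^2\right)\big|_{\downarrow M} = 1$, we get (21). ■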
The factor M in (21) is again due to the oversampled nature of the signal x(n). It is interesting to note that the noise shaping contribution to $G_{opt}$ in (21), which we denote by $G_{hw}$, is exactly the coding gain we would obtain by half whitening the WSS process y(n) in the usual way [11]. By appealing to the Cauchy-Schwarz inequality again, we can show that $G_{hw} \ge 1$ with equality iff the power spectral density $S_{yy}(e^{j\omega})$ is a constant, i.e., y(n) is white noise. Therefore, for the particular system of Fig. 6, we will not get additional coding gain by noise shaping if the driving WSS process y(n) in Fig. 1 is white noise. For completeness, we would like to mention that the following expression for the coding gain of Fig. 17 (the multiband case) can be derived under the assumption that the L WSS processes $y_k(n)$, k = 0, 1, ..., L-1, are uncorrelated:

$$G_{opt} = \frac{M\int_{-\pi}^{\pi}\sum_{k=0}^{L-1}S_{y_k}(e^{jM\omega})|F_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi}}{\left(\int_{-\pi}^{\pi}\sqrt{\sum_{k=0}^{L-1}S_{y_k}(e^{jM\omega})|F_k(e^{j\omega})|^2}\,\sqrt{\sum_{k=0}^{L-1}|F_k(e^{j\omega})|^2}\,\frac{d\omega}{2\pi}\right)^2} \tag{25}$$

7.2.  Using  a  more  general  postfilter 

Consider  now  the  more  general  system  of  Fig.  8  where  the  postfilter  is  not  assumed  to  be  the  inverse  of 
the prefilter. The multiband case is shown in Fig. 18. The goal is to jointly optimize the prefilter $P(e^{j\omega})$ and the postfilter $V(e^{j\omega})$ to again minimize the average m.s.e $\mathcal{E} = \frac{1}{M}\sum_{n=0}^{M-1}E\{x(n)-\hat{x}(n)\}^2$ under the following assumptions:

1. The input x(n) is assumed to be a zero mean real wide sense stationary process.

2. The input x(n) and the quantization noise q(n) are uncorrelated processes, i.e., $E\{x(n)q(m)\} = 0\ \ \forall\ n, m$.

3. The quantization noise q(n) is white with variance $\sigma_q^2$ as in (7).

4. The filters $P(e^{j\omega})$ and $V(e^{j\omega})$ are not constrained to be rational functions and can be non causal.

5. The power spectral density $S_{xx}(e^{j\omega})$ is positive for all $\omega$. Furthermore, for the derivation of the optimum prefilter, we will also require $S_{xx}(e^{j\omega})$ and its first derivative to be continuous functions of frequency.

To solve the above problem, our approach will be the following: First, consider the single band case of Fig. 8. Unlike previous quantization schemes, we observe that in the absence of the quantizer, the scheme of Fig. 8 is not a perfect reconstruction system. The error sequence $e(n) = x(n) - \hat{x}(n)$ has in fact two components: one due to the mismatch between the pre- and post filters and the other due to the filtered quantization noise. We cannot therefore simply minimize the mean square reconstruction error before the downsampler as in the previous sections. Using the m.s.e definition given above, we derive an expression for the average mean square reconstruction error $\frac{1}{M}\sum_{n=0}^{M-1}E\{e^2(n)\}$ in terms of the filters and the average power spectrum of the signal x(n) and noise q(n). The use of the average power spectral density of the $(CWSS)_M$ input x(n) in this case is not theoretically correct, even under the same quantizer assumptions as before. Nevertheless, it is necessary to work with this quantity to obtain any meaningful comparison between this more general set up and the one of the previous subsection. The calculus of variations is used as a tool to derive closed form expressions for both the optimum pre- and post filters which are then used to obtain the coding gain expression of Fig. 8. Finally, we will show how to generalize the results for the multiband case of Fig. 18.


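Theorem 7.2.1. For a fixed prefilter $P(e^{j\omega})$ and a given filter $F(e^{j\omega})$, the optimum postfilter $V_{opt}(e^{j\omega})$ is:

$$V_{opt}(e^{j\omega}) = \frac{1}{P(e^{j\omega})}\cdot\frac{S_{xx}(e^{j\omega})}{S_{xx}(e^{j\omega}) + \dfrac{c2^{-2b}}{|P(e^{j\omega})|^2}\displaystyle\int_{-\pi}^{\pi}S_{xx}(e^{j\omega})|P(e^{j\omega})|^2\,\frac{d\omega}{2\pi}} \tag{26}$$

Proof. The average mean square reconstruction error can be expressed as follows:

$$\begin{aligned}
\mathcal{E} &= \frac{1}{M}\sum_{n=0}^{M-1}E\{[x(n)-\hat{x}(n)]^2\}\\
&= \frac{1}{M}\sum_{n=0}^{M-1}E\{x^2(n)\} + \frac{1}{M}\sum_{n=0}^{M-1}E\{\hat{x}^2(n)\} - \frac{1}{M}\sum_{n=0}^{M-1}E\{\hat{x}(n)x(n)\} - \frac{1}{M}\sum_{n=0}^{M-1}E\{x(n)\hat{x}(n)\}\\
&= \int_{-\pi}^{\pi}S_{xx}(e^{j\omega})\,\frac{d\omega}{2\pi} + \frac{1}{M}\int_{-\pi}^{\pi}S_{xx}(e^{j\omega})|P(e^{j\omega})|^2|V(e^{j\omega})|^2|F(e^{j\omega})|^2\,\frac{d\omega}{2\pi}\\
&\quad + \frac{\sigma_q^2}{M}\int_{-\pi}^{\pi}|V(e^{j\omega})|^2|F(e^{j\omega})|^2\,\frac{d\omega}{2\pi} - \frac{2}{M}\int_{-\pi}^{\pi}S_{xx}(e^{j\omega})|F(e^{j\omega})|^2\,\Re\{P(e^{j\omega})V(e^{j\omega})\}\,\frac{d\omega}{2\pi}
\end{aligned} \tag{27}$$

where $\Re$ stands for the real part. First, observe that the average m.s.e dependency on the phase of the filters appears only in the last term. To minimize (27) with respect to the phase of the filters, the product $P(e^{j\omega})V(e^{j\omega})$ must be zero phase. To see this, simply set $P(e^{j\omega}) = |P(e^{j\omega})|e^{j\phi(\omega)}$ and $V(e^{j\omega}) = |V(e^{j\omega})|e^{j\theta(\omega)}$. The real part of $P(e^{j\omega})V(e^{j\omega})$ is equal to $|V(e^{j\omega})||P(e^{j\omega})|\cos(\phi(\omega)+\theta(\omega))$. To minimize (27), $\cos(\phi(\omega)+\theta(\omega))$ must be equal to one. Dropping the real part notation $\Re$ in (27), we now turn to the magnitude squared response of the filters. We first fix the prefilter $P(e^{j\omega})$ and optimize $|V(e^{j\omega})|$. This can be done by applying the Euler-Lagrange equation from the calculus of variations [15] to (27). The resulting expression is (26). ■

It is interesting to note that the post filter is independent of $F(e^{j\omega})$. Substituting (26) into (27), we obtain the following average m.s.e expression:

$$\mathcal{E}(|P|^2, b) = \frac{1}{M}\int_{-\pi}^{\pi}\frac{S_{xx}(e^{j\omega})\left(S_{xx}(e^{j\omega})|P(e^{j\omega})|^2\left(M-|F(e^{j\omega})|^2\right) + c2^{-2b}M\displaystyle\int_{-\pi}^{\pi}S_{xx}(e^{j\omega})|P(e^{j\omega})|^2\,\frac{d\omega}{2\pi}\right)}{S_{xx}(e^{j\omega})|P(e^{j\omega})|^2 + c2^{-2b}\displaystyle\int_{-\pi}^{\pi}S_{xx}(e^{j\omega})|P(e^{j\omega})|^2\,\frac{d\omega}{2\pi}}\,\frac{d\omega}{2\pi} \tag{28}$$

The above equation is only a function of the magnitude squared response of the prefilter. From this point on, the problem under study is very similar to the one analyzed recently in [16] and in fact, becomes exactly the same by setting M and $F(e^{j\omega})$ to unity in equation (28). We will therefore omit the proofs of the upcoming theorems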
referring  the  reader  to  [16]. 
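
The role of the optimum postfilter can be illustrated numerically. The sketch below is hedged in several ways: it evaluates the average m.s.e in the form $\mathcal{E} = \int S_{xx} + \frac{1}{M}\int |F|^2|V|^2(S_{xx}|P|^2 + \sigma_q^2) - \frac{2}{M}\int S_{xx}|F|^2|P||V|$ (all integrals with measure $d\omega/2\pi$, product $PV$ taken as zero phase), which is our reading of (27); it assumes M = 2, $|F(e^{j\omega})|^2 = 1 + \cos\omega$, an AR(1) input with $\rho = 0.6$, b = 3, c = 2.4, and a flat prefilter chosen purely for illustration.

```python
import math

N, M = 2048, 2
c, b, rho = 2.4, 3, 0.6                       # c, b as in the examples; rho is arbitrary
w = [-math.pi + (i + 0.5) * 2 * math.pi / N for i in range(N)]

def mean(values):
    return sum(values) / len(values)

Syy_M = [(1 - rho ** 2) / (1 + rho ** 2 - 2 * rho * math.cos(M * v)) for v in w]
F2 = [1 + math.cos(v) for v in w]             # |F|^2 for F = (1 + z^-1)/sqrt(2)
Sxx = [s * f2 / M for s, f2 in zip(Syy_M, F2)]
Pmag = [1.0] * N                              # fixed prefilter magnitude (hypothetical choice)
sigma_q2 = c * 2.0 ** (-2 * b) * mean([s * p * p for s, p in zip(Sxx, Pmag)])

def avg_mse(Vmag):
    """Average m.s.e. as a function of the postfilter magnitude (zero-phase P*V)."""
    signal = mean(Sxx)
    estimate = mean([f2 * v * v * (s * p * p + sigma_q2)
                     for f2, v, s, p in zip(F2, Vmag, Sxx, Pmag)]) / M
    cross = 2 * mean([s * f2 * p * v
                      for s, f2, p, v in zip(Sxx, F2, Pmag, Vmag)]) / M
    return signal + estimate - cross

V_opt = [s * p / (s * p * p + sigma_q2) for s, p in zip(Sxx, Pmag)]   # magnitude of (26)
V_inv = [1.0 / p for p in Pmag]                                       # inverse postfilter
mse_opt, mse_inv = avg_mse(V_opt), avg_mse(V_inv)
print(mse_opt, mse_inv)
```

Since the integrand is a quadratic in $|V|$ at every frequency, the magnitude in (26) minimizes it pointwise; any other postfilter magnitude, including the plain inverse, yields a larger value of this expression.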

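Theorem 7.2.2. The squared magnitude response $|P_{opt}(e^{j\omega})|^2$ that minimizes $\mathcal{E}(|P|^2, b)$, given in (28), is also the solution of the following constrained optimization problem:

$$\min_{|P(e^{j\omega})|^2}\ \int_{-\pi}^{\pi}\frac{S_{xx}(e^{j\omega})\left(S_{xx}(e^{j\omega})|P(e^{j\omega})|^2\left(1-\frac{|F(e^{j\omega})|^2}{M}\right) + c2^{-2b}\right)}{S_{xx}(e^{j\omega})|P(e^{j\omega})|^2 + c2^{-2b}}\,\frac{d\omega}{2\pi} \tag{29}$$

subject to:

$$\int_{-\pi}^{\pi}S_{xx}(e^{j\omega})|P(e^{j\omega})|^2\,\frac{d\omega}{2\pi} = 1 \tag{30}$$

Theorem 7.2.3. The optimum prefilter that minimizes (29) under the constraint (30) must have a magnitude squared response $|P_{opt}(e^{j\omega})|^2$ of the following form:

$$|P_{opt}(e^{j\omega})|^2 = \max\left(0,\ \frac{(1+c2^{-2b})\,|F(e^{j\omega})|}{\sqrt{S_{xx}(e^{j\omega})}\displaystyle\int_{-\pi}^{\pi}\sqrt{S_{xx}(e^{j\omega})}\,|F(e^{j\omega})|\,\frac{d\omega}{2\pi}} - \frac{c2^{-2b}}{S_{xx}(e^{j\omega})}\right)\qquad \forall\ \omega\in[-\pi,\pi] \tag{31}$$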


Theorem 7.2.4. With the optimal choice of pre- and postfilters, the coding gain expression for the scheme of Fig. 8 is

$$G_{opt} = (1 + c2^{-2b})\,M\,G_{hw} \tag{32}$$

as long as $|P_{opt}(e^{j\omega})|^2$ in (31) is never set to zero for any $\omega$. Here, $G_{hw}$ is again the half whitening coding gain of the WSS process y(n).

Note that in this case the coding gain of the more general set up is a product of three factors: $G_{hw}$ due to the noise shaping, the oversampling factor M due to the signal model and $1 + c2^{-2b}$ due to using a more general form of pre- and post filters.
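
A quick numerical illustration of the third factor, using the values b = 3 and c = 2.4 quoted for Figs. 19 and 20, and b = 1 as an additional low-resolution case chosen here for contrast:

$$1 + c2^{-2b}\Big|_{b=3,\ c=2.4} = 1 + \frac{2.4}{64} = 1.0375 \approx 0.16\ \text{dB}, \qquad 1 + c2^{-2b}\Big|_{b=1,\ c=2.4} = 1 + \frac{2.4}{4} = 1.6 \approx 2.04\ \text{dB}.$$

The extra gain from the more general postfilter is therefore small at moderate resolution and becomes noticeable only for coarse quantizers.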

To conclude this section, we would like to repeat the same procedure for the more general scheme of Fig. 18. We claim that, for this case, the optimum postfilter is still given by (26) and the optimum prefilter magnitude squared response expression is obtained from (31) by simply replacing $|F(e^{j\omega})|$ by $\sqrt{\sum_{k=0}^{L-1}|F_k(e^{j\omega})|^2}$. To prove this, the key is to derive an expression for the average mean square reconstruction error of Fig. 18. Clearly, if we can show that $\mathcal{E}$ for the multiband case can be expressed as

$$\begin{aligned}
\mathcal{E} &= \frac{1}{M}\sum_{n=0}^{M-1}E\{x^2(n)\} + \frac{1}{M}\sum_{n=0}^{M-1}E\{\hat{x}^2(n)\} - \frac{1}{M}\sum_{n=0}^{M-1}E\{\hat{x}(n)x(n)\} - \frac{1}{M}\sum_{n=0}^{M-1}E\{x(n)\hat{x}(n)\}\\
&= \int_{-\pi}^{\pi}S_{xx}(e^{j\omega})\,\frac{d\omega}{2\pi} + \frac{1}{M}\int_{-\pi}^{\pi}S_{xx}(e^{j\omega})|P(e^{j\omega})|^2|V(e^{j\omega})|^2\sum_{k=0}^{L-1}|F_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi}\\
&\quad + \frac{\sigma_q^2}{M}\int_{-\pi}^{\pi}|V(e^{j\omega})|^2\sum_{k=0}^{L-1}|F_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi} - \frac{2}{M}\int_{-\pi}^{\pi}S_{xx}(e^{j\omega})P(e^{j\omega})V(e^{j\omega})\sum_{k=0}^{L-1}|F_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi}
\end{aligned} \tag{33}$$

then, from the previous analysis, the above claim follows immediately. To derive (33), we need to only consider the second term and one of the cross terms. The second term $\frac{1}{M}\sum_{n=0}^{M-1}E\{\hat{x}^2(n)\}$ is the average variance of the signal estimate at the output of Fig. 18. But from result 2 of section III, we know that it is equal to $\frac{1}{M}\sum_{k=0}^{L-1}\sigma_{\hat{y}_k}^2$ where $\sigma_{\hat{y}_k}^2$ is the variance of the signal estimate before the kth channel downsampler, k = 0, ..., L-1. Substituting for $\sigma_{\hat{y}_k}^2$ in this last relation, we obtain the second and third integral in (33). Consider now one of the cross terms, say $\frac{1}{M}\sum_{n=0}^{M-1}E\{\hat{x}(n)x(n)\}$. We can rewrite $\hat{x}(n)$ as $\sum_{k=0}^{L-1}\hat{x}_k(n)$ where $\hat{x}_k(n)$ is the signal estimate at the output of the kth channel. By the linearity of the expectation, this gives $\frac{1}{M}\sum_{k=0}^{L-1}\sum_{n=0}^{M-1}E\{\hat{x}_k(n)x(n)\}$. By interpreting the single band case as the kth channel, the last integral follows easily. Equation (33) is therefore established and the claim is proved.

Example 7.1. Case of a MA(1) process y(n). Assume that the input x(n) is modeled as in Fig. 1 with M = 2 and $F(z) = \frac{1}{\sqrt{2}}(1 + z^{-1})$. Let the driving WSS signal y(n) be a zero mean Gaussian MA(1) process with an autocorrelation sequence in the form

$$R_{yy}(k) = \begin{cases} 1 & k = 0,\\[2pt] -\dfrac{\theta}{1+\theta^2} & k = 1, -1,\\[6pt] 0 & \text{otherwise.}\end{cases}$$

The MA(1) process has to have $\left|\frac{R_{yy}(1)}{R_{yy}(0)}\right| < 1/2$ to ensure that the power spectral density is indeed non negative. We therefore restrict $\theta$ to be between -1 and 1. The power spectrum of the MA(1) process is given by:

$$S_{yy}(e^{j\omega}) = 1 - 2\,\frac{\theta}{1+\theta^2}\cos(\omega) \tag{34}$$

Substituting (34) in (21), the coding gain expression of the scheme of Fig. 6 becomes

$$G_{opt} = \frac{2(1+\theta^2)}{\left(\int_{-\pi}^{\pi}\sqrt{1+\theta^2-2\theta\cos(\omega)}\,\frac{d\omega}{2\pi}\right)^2} \tag{35}$$

The integral in (35) is equal to $F(-0.5, -0.5; 1; \theta^2)$ where $F(a, b; c; d)$ is Gauss's hypergeometric function. From [17], $F(-0.5, -0.5; 1; \theta^2)$ can be rewritten as $(1+\theta)F(-0.5, 0.5; 1; 4\theta/(1+\theta)^2)$. This, in turn, can be simplified to $(1+\theta)\frac{2}{\pi}E\left(2\sqrt{|\theta|}/(1+\theta)\right)$ where $E(\cdot)$ is the complete elliptic integral of the second kind. The coding gain of the more general system can be obtained by multiplying (35) by $(1 + c2^{-2b})$ and obviously depends on the number of bits b. The plots of the coding gain are illustrated in Fig. 19 for b = 3 and c = 2.4.
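
The integral appearing in the MA(1) coding gain can be evaluated without special-function libraries. The sketch below compares a midpoint-rule evaluation of $\int_{-\pi}^{\pi}\sqrt{1+\theta^2-2\theta\cos\omega}\,\frac{d\omega}{2\pi}$ with the power series of the hypergeometric function, $F(-\tfrac12,-\tfrac12;1;\theta^2) = \sum_{n\ge 0}\binom{1/2}{n}^2\theta^{2n}$, and then forms the coding gain with M = 2. The function names and the chosen values of $\theta$ are illustrative assumptions.

```python
import math

def ma1_integral_numeric(theta, N=4096):
    """Midpoint rule for the integral of sqrt(1 + theta^2 - 2 theta cos w) dw / (2 pi)."""
    h = 2 * math.pi / N
    total = 0.0
    for i in range(N):
        w = -math.pi + (i + 0.5) * h
        total += math.sqrt(1 + theta ** 2 - 2 * theta * math.cos(w))
    return total / N

def ma1_integral_series(theta, terms=80):
    """F(-1/2, -1/2; 1; theta^2) as sum_n binom(1/2, n)^2 theta^(2n)."""
    total, coeff = 0.0, 1.0                 # coeff = binom(1/2, n), starting at n = 0
    for n in range(terms):
        total += coeff * coeff * theta ** (2 * n)
        coeff *= (0.5 - n) / (n + 1)        # recurrence to binom(1/2, n + 1)
    return total

def coding_gain_ma1(theta, M=2):
    """M (1 + theta^2) divided by the squared integral."""
    return M * (1 + theta ** 2) / ma1_integral_numeric(theta) ** 2

for theta in (0.0, 0.5, -0.7):
    print(theta, ma1_integral_numeric(theta), ma1_integral_series(theta),
          coding_gain_ma1(theta))
```

For $\theta = 0$ the input is white and the gain reduces to the oversampling factor M = 2; any nonzero $\theta$ gives a larger value, in line with $G_{hw} \ge 1$.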

Example 7.2. Case of an AR(1) process y(n). With the same assumptions as in example 7.1, let the driving signal y(n) be a zero mean Gaussian AR(1) process with an autocorrelation sequence in the form $R_{yy}(k) = \rho^{|k|}$ where $\rho$ is between 0 and 1. The power spectrum of the AR(1) process is

$$S_{yy}(e^{j\omega}) = \frac{1-\rho^2}{1+\rho^2-2\rho\cos(\omega)} \tag{36}$$

Substituting (36) in (21), the coding gain expression for the scheme of Fig. 6 is as follows:

$$G_{opt} = \frac{2}{(1-\rho^2)\left(\int_{-\pi}^{\pi}\dfrac{1}{\sqrt{1+\rho^2-2\rho\cos(\omega)}}\,\dfrac{d\omega}{2\pi}\right)^2} \tag{37}$$

The integral in (37) is equal to $\frac{2}{\pi}K(\rho)$ where $K(\rho)$ is the complete elliptic integral of the first kind [17]. Again, the coding gain of the more general system is obtained by multiplying (37) by $(1 + c2^{-2b})$. The plots of the coding gain are shown in Fig. 20 for b = 3 and c = 2.4.
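
The closed form for the AR(1) case can be verified numerically. The sketch below evaluates $K(\rho)$ through the arithmetic-geometric mean identity $K(k) = \pi / \big(2\,\mathrm{AGM}(1, \sqrt{1-k^2})\big)$, with $\rho$ taken as the modulus k, and compares $\frac{2}{\pi}K(\rho)$ with a midpoint-rule evaluation of the integral. The function names and the test value $\rho = 0.5$ are illustrative assumptions.

```python
import math

def ellipk_agm(k, iterations=40):
    """Complete elliptic integral of the first kind, modulus k, via the AGM."""
    a, g = 1.0, math.sqrt(1 - k * k)
    for _ in range(iterations):
        a, g = 0.5 * (a + g), math.sqrt(a * g)
    return math.pi / (2 * a)

def ar1_integral_numeric(rho, N=4096):
    """Midpoint rule for the integral of (1 + rho^2 - 2 rho cos w)^(-1/2) dw / (2 pi)."""
    h = 2 * math.pi / N
    total = 0.0
    for i in range(N):
        w = -math.pi + (i + 0.5) * h
        total += 1.0 / math.sqrt(1 + rho ** 2 - 2 * rho * math.cos(w))
    return total / N

def coding_gain_ar1(rho, M=2):
    """M / ((1 - rho^2) * ((2/pi) K(rho))^2), using sigma_y^2 = 1."""
    integral = 2 / math.pi * ellipk_agm(rho)
    return M / ((1 - rho ** 2) * integral ** 2)

rho = 0.5
print(ar1_integral_numeric(rho), 2 / math.pi * ellipk_agm(rho), coding_gain_ar1(rho))
```

Note that some numerical libraries parameterize $K$ by $m = k^2$ rather than by the modulus k, so the argument should be squared accordingly when such a routine is used instead of the AGM helper above.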

VIII. NOISE SHAPING BY $(LPTV)_M$ PRE- AND POST FILTERS

In this section, we consider using $(LPTV)_M$ pre- and post filters instead of LTI ones surrounding a periodically time varying ($(PTV)_M$) quantizer. Since the signal model x(n) is $(CWSS)_M$, restricting ourselves to linear time invariant noise shaping filters and quantizers is a loss of generality. Any optimum configuration for such processes should consist of $(LPTV)_M$ filters surrounding a $(PTV)_M$ quantizer. Using some well known multirate results, it can be shown that this new quantization configuration is equivalent to an M-channel maximally decimated filter bank with M subband quantizers [5]. We will further impose the perfect reconstruction condition in the absence of quantization by confining ourselves to the class of perfect reconstruction filter banks. It follows that $\mathbf{R}(e^{j\omega}) = \mathbf{E}^{-1}(e^{j\omega})$ where $\mathbf{E}(e^{j\omega})$ and $\mathbf{R}(e^{j\omega})$ denote respectively the analysis and synthesis polyphase matrices [17]. Equivalently, the analysis and synthesis filters satisfy the biorthogonality condition: $\left(P_k(e^{j\omega})Q_m(e^{j\omega})\right)\big|_{\downarrow M} = \delta(m-k)$ for all k, m. The goal is then to find the set of M analysis and synthesis filters, $P_k(e^{j\omega})$ and $Q_k(e^{j\omega})$ (equivalently the analysis and synthesis polyphase matrices), that minimize the average mean square error at the output due to the quantization noise. Because the general $(LPTV)_M$ problem is difficult to tackle analytically, we will only study two special forms of the above set up. The first case assumes that $\mathbf{E}(e^{j\omega})$ is diagonal with diagonal elements equal to $V_k(e^{j\omega})$. It follows that $\mathbf{R}(e^{j\omega})$ is also diagonal with diagonal elements equal to $1/V_k(e^{j\omega})$ for each k. The second case assumes that $\mathbf{E}(e^{j\omega})$ is paraunitary and we choose $\mathbf{R}(e^{j\omega}) = \mathbf{E}^{\dagger}(e^{j\omega})$. Alternatively, the synthesis filters $Q_k(e^{j\omega})$ are equal to $P_k^*(e^{j\omega})$ for each k and $\left(P_k(e^{j\omega})P_m^*(e^{j\omega})\right)\big|_{\downarrow M} = \delta(m-k)$ for all k, m. These two special forms are intermediate between one extreme (the LTI case) and the other (the general $(LPTV)_M$ case).

8.1. Letting the synthesis filter be the inverse of the analysis filter

Let $\mathbf{E}(e^{j\omega})$ be a diagonal matrix with diagonal elements equal to $V_k(e^{j\omega})$ and $\mathbf{R}(e^{j\omega})$ be also diagonal with diagonal elements equal to $1/V_k(e^{j\omega})$ for each k. The quantization configuration is shown in Fig. 9 for the single band case and Fig. 21 for the multiband case. The scalar quantizers labeled Q are modeled as additive noise sources $q_k(n)$ and individually satisfy relation (7). Throughout this section, we will assume that the subband quantization noise sources $q_k(n)$ are white and pairwise uncorrelated, i.e., the noise power spectral density matrix is given by

$$\mathbf{S}_{qq}(e^{j\omega}) = \begin{pmatrix}\sigma_{q_0}^2 & & 0\\ & \ddots & \\ 0 & & \sigma_{q_{M-1}}^2\end{pmatrix} \tag{38}$$

The goal is then to jointly allocate the subband bits $b_k$ under a fixed bit rate

$$b = \frac{1}{M}\sum_{k=0}^{M-1}b_k \tag{39}$$

and optimize $V_k(e^{j\omega})$ in order to minimize the average m.s.e at the output of Fig. 9 and Fig. 21. Our strategy is as follows: we first find the optimum solution for the single band case of Fig. 9. Then, by interpreting the single band model as one of the L channels of the more general multiband case, the optimum solution for Fig. 21 follows.

Theorem 8.1.1. Consider the scheme of Fig. 9 under the above assumptions. The optimum filter $V_{opt}(e^{j\omega})$ that minimizes the average mean square reconstruction error at the output is independent of k and has the following magnitude squared response:

$$|V_{opt}(e^{j\omega})|^2 = \frac{1}{\sqrt{S_{yy}(e^{j\omega})}} \tag{40}$$

where $S_{yy}(e^{j\omega})$ is the power spectrum of the WSS process y(n) in Fig. 1. With the above optimum filter expression, the coding gain of Fig. 9 is then given by:

$$G_{opt} = \frac{\sigma_y^2}{M\left(\prod_{k=0}^{M-1}\int_{-\pi}^{\pi}\sqrt{S_{yy}(e^{j\omega})}\,|R_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi}\right)^{2/M}} \tag{41}$$

where $R_k(e^{j\omega})$ is the kth polyphase component of $F(e^{j\omega})$.

Proof. Since the system has the perfect reconstruction property in the absence of quantization, the error e(n) at the output is simply the filtered quantization noise signal. After the downsampler, the filtered noise component w(n) is WSS. By result 2 of section III, $\mathcal{E} = \frac{1}{M}\sigma_w^2$. To compute $\sigma_w^2$, we express the filter $F(e^{j\omega})$ in terms of its M polyphase components $R_k(e^{j\omega})$. Because the input signal x(n) is modeled as in Fig. 1, we can also invoke the polyphase identity (see [5], p. 133) at the input to simplify Fig. 9 to Fig. 22. (The interpolation filter was not drawn because we are really interested in evaluating $\sigma_w^2$ rather than the error variance at the final output.) Since the quantization noise sources are assumed to be white and uncorrelated, the average mean squared error is therefore given by:

$$\begin{aligned}
\mathcal{E} &= \frac{1}{M}\sum_{k=0}^{M-1}\sigma_{q_k}^2\int_{-\pi}^{\pi}\frac{|R_k(e^{j\omega})|^2}{|V_k(e^{j\omega})|^2}\,\frac{d\omega}{2\pi}\\
&= \frac{c}{M}\sum_{k=0}^{M-1}2^{-2b_k}\int_{-\pi}^{\pi}S_{yy}(e^{j\omega})|R_k(e^{j\omega})|^2|V_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi}\int_{-\pi}^{\pi}\frac{|R_k(e^{j\omega})|^2}{|V_k(e^{j\omega})|^2}\,\frac{d\omega}{2\pi}
\end{aligned} \tag{42}$$

Using the AM-GM inequality, equation (39) and the fact that $|R_k(e^{j\omega})|^2 = |R_k(e^{-j\omega})|^2$, equation (42) reduces to:

$$\mathcal{E} \ge c2^{-2b}\left(\prod_{k=0}^{M-1}\int_{-\pi}^{\pi}S_{yy}(e^{j\omega})|V_k(e^{j\omega})|^2|R_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi}\int_{-\pi}^{\pi}\frac{|R_k(e^{j\omega})|^2}{|V_k(e^{j\omega})|^2}\,\frac{d\omega}{2\pi}\right)^{1/M} \tag{43}$$

Applying the Cauchy-Schwarz inequality to each term in (43), we get:

$$\mathcal{E}_{min} = c2^{-2b}\left(\prod_{k=0}^{M-1}\int_{-\pi}^{\pi}\sqrt{S_{yy}(e^{j\omega})}\,|R_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi}\right)^{2/M} \tag{44}$$

This minimum bound is achieved by choosing $|V_{opt}(e^{j\omega})|^2$ as in (40). Finally, (41) follows immediately from the definition of the coding gain, equation (7) and the fact that $\sigma_x^2 = \sigma_y^2/M$. ■
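
The AM-GM step in the proof corresponds to the familiar high-resolution bit allocation rule. As a sketch, suppose $a_k$ denotes the product of the two integrals attached to channel k in (42); the values below are hypothetical. Minimizing $\frac{1}{M}\sum_k 2^{-2b_k}a_k$ subject to (39) gives $b_k = b + \frac{1}{2}\log_2\big(a_k / (\prod_i a_i)^{1/M}\big)$, at which point every term equals $2^{-2b}$ times the geometric mean of the $a_k$. Non-integer (and possibly negative) bit counts are a known artifact of this model and are ignored here.

```python
import math

def optimal_bits(a, b):
    """Bit allocation that attains the AM-GM bound for average rate b."""
    M = len(a)
    log_gm = sum(math.log2(v) for v in a) / M       # log2 of the geometric mean
    return [b + 0.5 * (math.log2(v) - log_gm) for v in a]

def weighted_noise(a, bits):
    """(1/M) sum_k 2^(-2 b_k) a_k, the bit-dependent part of (42) up to the factor c."""
    return sum(2.0 ** (-2 * bk) * v for bk, v in zip(bits, a)) / len(a)

a = [4.0, 1.0, 0.25, 0.5]          # hypothetical per-channel integral products
b = 3.0                            # average number of bits per sample
bits = optimal_bits(a, b)
geometric_mean = math.exp(sum(math.log(v) for v in a) / len(a))

noise_opt = weighted_noise(a, bits)
noise_uniform = weighted_noise(a, [b] * len(a))
print(bits, noise_opt, noise_uniform)
```

The optimized allocation reaches $2^{-2b}$ times the geometric mean, whereas equal bits in every channel give $2^{-2b}$ times the arithmetic mean, which is never smaller.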

The LTI case is indeed a loss of generality. Since the class of $(LPTV)_M$ filters and $(PTV)_M$ quantizers includes the LTI case, it is clear that the performance of this more general class of filters and quantizers is at least as good as the LTI one. We have already shown that the optimum $(LPTV)_M$ filter for Fig. 9 reduces to an LTI one. The question then becomes: does the $(PTV)_M$ quantizer provide any excess gain over the LTI case and if so, by how much? We show next that, even in this restricted form of $(LPTV)_M$ filters, the coding gain of the above scheme is always greater than the LTI one except when the magnitude squared responses of the polyphase components $R_k(e^{j\omega})$ of $F(e^{j\omega})$ are equal for all k. Starting from the denominator of (22) (the coding gain expression of Fig. 6), written in terms of $S_{yy}(e^{j\omega})$ with numerator $\sigma_y^2$, one can write the following series of steps:

$$\begin{aligned}
\frac{1}{M}\left(\int_{-\pi}^{\pi}\sqrt{S_{yy}(e^{j\omega})}\,\frac{d\omega}{2\pi}\right)^2 &= \frac{1}{M}\left(\int_{-\pi}^{\pi}\sqrt{S_{yy}(e^{j\omega})}\sum_{k=0}^{M-1}|R_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi}\right)^2\\
&= \frac{1}{M}\left(\sum_{k=0}^{M-1}\int_{-\pi}^{\pi}\sqrt{S_{yy}(e^{j\omega})}\,|R_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi}\right)^2\\
&\ge M\left(\prod_{k=0}^{M-1}\int_{-\pi}^{\pi}\sqrt{S_{yy}(e^{j\omega})}\,|R_k(e^{j\omega})|^2\,\frac{d\omega}{2\pi}\right)^{2/M}
\end{aligned} \tag{45}$$

where the last line in (45) is the denominator of (41). Since the numerator is the same in both cases, the claim is proved. The first equality in (45) is obtained by using the power complementary property of the polyphase components of $F(e^{j\omega})$. The second line is a consequence of the linearity of the integral. The third line results from applying the AM-GM inequality. From the AM-GM formula, we know that equality is achieved if and only if all $|R_k(e^{j\omega})|^2$ are equal. From Fig. 22 (which was introduced in the proof of Theorem 8.1.1), we can see that this makes perfect sense. If all $|R_k(e^{j\omega})|^2$ are equal and since the optimum filters $V_k(e^{j\omega})$ are independent of k, the variances of the subband quantizer inputs will all be equal. There is therefore no variance disparity in the subbands and optimum bit allocation of the subband quantizers (which depends on the AM-GM inequality) cannot produce any gain. Using the single band result, we can now derive closed form expressions for the optimum $V_{opt,k}(e^{j\omega})$ and the average minimum mean squared error for the multiband case.
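
The comparison between the two coding gains can be made concrete with a two-tap filter whose polyphase components are constants. The sketch below assumes M = 2, $F(z) = \cos\varphi + \sin\varphi\,z^{-1}$ (so that $R_0 = \cos\varphi$, $R_1 = \sin\varphi$ and $|R_0|^2 + |R_1|^2 = 1$), and a unit-variance AR(1) input with $\rho = 0.6$; these choices are illustrative only. The LTI gain is taken in the form $M\sigma_y^2/(\int\sqrt{S_{yy}})^2$ and the subband gain in the form $\sigma_y^2 / \big(M(\prod_k\int\sqrt{S_{yy}}|R_k|^2)^{2/M}\big)$.

```python
import math

def mean_sqrt_psd(rho, N=4096):
    """Midpoint rule for the integral of sqrt(S_yy) dw / (2 pi), AR(1) spectrum."""
    h = 2 * math.pi / N
    total = 0.0
    for i in range(N):
        w = -math.pi + (i + 0.5) * h
        total += math.sqrt((1 - rho ** 2) / (1 + rho ** 2 - 2 * rho * math.cos(w)))
    return total / N

def gain_lti(rho, M):
    """LTI noise shaping gain with sigma_y^2 = 1."""
    return M / mean_sqrt_psd(rho) ** 2

def gain_lptv(rho, r):
    """Subband gain for constant polyphase components r[k], with sigma_y^2 = 1."""
    M = len(r)
    integral = mean_sqrt_psd(rho)
    product = 1.0
    for rk in r:
        product *= integral * rk * rk
    return 1.0 / (M * product ** (2.0 / M))

rho = 0.6
equal = [math.cos(math.pi / 4), math.sin(math.pi / 4)]      # |R_0|^2 = |R_1|^2
unequal = [math.cos(math.pi / 6), math.sin(math.pi / 6)]    # |R_0|^2 = 3/4, |R_1|^2 = 1/4
print(gain_lti(rho, 2), gain_lptv(rho, equal), gain_lptv(rho, unequal))
```

For this family of filters the ratio of the two gains is $1/(4\cos^2\varphi\,\sin^2\varphi)$, which equals one only when the two polyphase components have equal magnitude.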

Theorem 8.1.2. Consider the scheme of Fig. 21 under the above assumptions. The optimum filter $V_{\mathrm{opt},k}(e^{j\omega})$ (for each $k$) that minimizes the average mean square reconstruction error at the output has the following magnitude squared response:

$$|V_{\mathrm{opt},k}(e^{j\omega})|^2 = \ldots \tag{46}$$

where $R_{ik}(e^{j\omega})$ is the $k$th polyphase component of the $i$th filter $F_i(e^{j\omega})$ and $S_k(e^{j\omega}) = \sum_{i=0}^{L-1} S_{y_i}(e^{j\omega})\,|R_{ik}(e^{j\omega})|^2$ is the power spectrum of the $k$th channel. Using the above optimum filters, the coding gain of Fig. 21 is then given by:


Gopt  —  ~  I  2/M 

M(nf=oX  \Rik(ei“)\>£) 


Proof.  By  interpreting  the  single  band  result  as  one  of  the  L  channels  of  the  multiband  model  and  by  using  result 
2  of  section  III,  the  average  mean  squared  error  can  be  expressed  as  follows: 


M  hi  Vk  i  ww)i2  ^ 

-  —  Y"1  2~2bk  r  sk(eju) \Vk(ePu)\2—  C  ^=0’  — 

~M^n2  LSk{6  Wk{£  )l  2n  L  \Vk(e^)\2  2* 


\Vk{e>“)\2 
Using the same inequalities as in the proof of Theorem 8.1.1, we can immediately derive (46) and (47). ■

Following  the  same  type  of  reasoning  as  before,  we  again  expect  the  coding  gain  of  the  more  general  (LPTV)m 
case  of  Fig.  21  to  be  higher  than  the  analogous  LTI  one  of  Fig.  17.  However,  the  complexity  of  the  expressions 
(25)  and  (47)  in  this  case  prevents  a  formal  mathematical  proof. 

Example 8.1. Equal polyphase components. Assume that the input $x(n)$ is modeled as in Fig. 1, where the upsampling factor is $M = 2$ and the driving input $y(n)$ is a zero mean Gaussian AR(1) process with correlation coefficient $0 < \rho < 1$. Furthermore, let $F(z)$ be the optimum FIR compaction filter of length two given by $\frac{1}{\sqrt{2}}(1 + z^{-1})$. The filter actually corresponds to one of the channels of a $2 \times 2$ KLT, which is independent of the input statistics. In this case, the polyphase components of $F(e^{j\omega})$ are $R_0(e^{j\omega}) = R_1(e^{j\omega}) = \frac{1}{\sqrt{2}}$. Substituting in (41) and simplifying, we get (21), the coding gain expression of Fig. 6. In Example 7.2, a closed form expression was derived for the AR(1) case and a plot of the coding gain is shown in Fig. 20.

Example 8.2. Unequal polyphase components. With the same set of assumptions as in Example 8.1, let the filter $F(z)$ be the optimum FIR compaction filter of length four. With $M = 2$ and assuming an AR(1) process, the following closed form expression was derived in [18] for the optimum compaction filter:

$$F(z) = a + c\,z^{-1} + b\,z^{-2} + d\,z^{-3} \tag{49}$$


where  _ 

c  =  +  - y/y/p- .  d  = 

and $p = 3 + \rho^2$, $q = 2 + \rho^2$. The polyphase components of $F(e^{j\omega})$ are $R_0(e^{j\omega}) = a + b\,e^{-j\omega}$ and $R_1(e^{j\omega}) = c + d\,e^{-j\omega}$. Substituting the power spectrum expression of an AR(1) process given by (36) into (41) and using some useful integral formulas (see [17], p. 429), we can derive the following coding gain expression for the scheme of Fig. 9:

$$
G_{\mathrm{opt}} = \frac{1}{2(1-\rho^2)\,\frac{2}{\pi}\left(\left(a^2+b^2+\frac{2ab}{\rho}\right)K(\rho)-\frac{2ab}{\rho}E(\rho)\right)\frac{2}{\pi}\left(\left(c^2+d^2+\frac{2cd}{\rho}\right)K(\rho)-\frac{2cd}{\rho}E(\rho)\right)}
\tag{50}
$$

where $K(\cdot)$ is the complete elliptic integral of the first kind and $E(\cdot)$ is the complete elliptic integral of the second kind. There is a reason for writing the denominator of (50) in this form. It can be shown that the factors $\frac{2}{\pi}\left(\left(a^2+b^2+\frac{2ab}{\rho}\right)K(\rho)-\frac{2ab}{\rho}E(\rho)\right)$ and $\frac{2}{\pi}\left(\left(c^2+d^2+\frac{2cd}{\rho}\right)K(\rho)-\frac{2cd}{\rho}E(\rho)\right)$ represent the variances of the outputs of $R_0(e^{j\omega})$ and $R_1(e^{j\omega})$ respectively (with an input with power spectrum $\sqrt{S_{yy}(e^{j\omega})}$). Their product is the square of their geometric mean, which produces the extra gain over the LTI case. The further apart they are in magnitude, the more gain we obtain. The plots of the coding gain formulas (37) and (50) are shown in Fig. 23. We notice that the coding gain of the (LPTV)$_M$ case is indeed greater than the LTI one for all values of $\rho$, although not by a substantial amount for the AR(1) $y(n)$.

8.2.  Using  an  orthonormal  filter  bank 

Consider now the $M$-channel orthonormal filter bank shown in Fig. 10 for the single band model and in Fig. 24 for the multiband model. As in the previous subsection, we first analyze the single band case in detail and then use the corresponding results to derive analogous expressions for the multiband case. The quantization noise assumptions of the previous subsection still hold here. The goal is again to jointly allocate the subband bits $b_k$ under the constraint (39) and optimize the orthonormal filter bank in order to minimize the average m.s.e.

Theorem 8.2.1. Consider the scheme of Fig. 10 under the above assumptions. The synthesis section of the optimum orthonormal filter bank $\{P_k(e^{j\omega})\}$ corresponds to choosing one of the filters, say $P_0(e^{j\omega})$, to be equal to $F(e^{j\omega})$ and the remaining filters $P_k(e^{j\omega})$, $k = 1, \ldots, M-1$, to be orthogonal to $P_0(e^{j\omega})$. In this case, the optimum orthonormal filter bank reduces to Fig. 5 where the quantizer $Q$ is allocated $Mb$ bits according to (37).


Proof. By applying the blocking operation and using the polyphase representation [5], the scheme of Fig. 10 can be redrawn as in Fig. 25, where $E(e^{j\omega})$ is the polyphase matrix of the analysis bank, $E^{\dagger}(e^{j\omega})$ is the polyphase matrix of the synthesis bank and $R_k(e^{j\omega})$, $k = 0, \ldots, M-1$, are the $M$ polyphase components of the filter $F(e^{j\omega})$. Let $U(e^{j\omega})$ be the $1 \times M$ vector whose $k$th element is $R_k(e^{j\omega})$. Then, the average m.s.e. can be expressed as follows:

$$\mathcal{E} = \frac{1}{M}\int_{-\pi}^{\pi} \mathrm{trace}\!\left(U(e^{j\omega})E(e^{j\omega})\,S_{qq}\,E^{\dagger}(e^{j\omega})U^{\dagger}(e^{j\omega})\right)\frac{d\omega}{2\pi} \tag{51}$$

Since the integrand is in a quadratic form, the trace operator can be removed. Furthermore, since $E(e^{j\omega})E^{\dagger}(e^{j\omega}) = I$ by orthonormality and $U(e^{j\omega})U^{\dagger}(e^{j\omega}) = 1$ by the Nyquist property of $F(e^{j\omega})$, we can rewrite (51) as follows:

$$\mathcal{E} = \frac{1}{M}\int_{-\pi}^{\pi} \frac{P(e^{j\omega})\,S_{qq}\,P^{\dagger}(e^{j\omega})}{P(e^{j\omega})P^{\dagger}(e^{j\omega})}\,\frac{d\omega}{2\pi} \tag{52}$$

where $P(e^{j\omega}) = U(e^{j\omega})E(e^{j\omega})$. Since the integrand of (52) is positive for all $\omega$, minimizing (52) is equivalent to minimizing the integrand at each frequency. But for any fixed frequency $\omega_0$, the ratio $\frac{P(e^{j\omega_0})\,S_{qq}\,P^{\dagger}(e^{j\omega_0})}{P(e^{j\omega_0})P^{\dagger}(e^{j\omega_0})}$ is a Rayleigh quotient. For each frequency $\omega$, the minimizing vector $P_{\mathrm{opt}}(e^{j\omega})$ has the form $(0 \ \ldots \ 1 \ \ldots \ 0)$, where the 1 in the $i$th position corresponds to the minimum noise variance $\sigma_{q_i}^2$. Since $P(e^{j\omega}) = U(e^{j\omega})E(e^{j\omega})$, the minimizing vector $P_{\mathrm{opt}}(e^{j\omega})$ can be obtained by setting the $i$th column of $E(e^{j\omega})$ to be equal to $U^{\dagger}(e^{j\omega})$ and all the remaining columns to be orthogonal to $U^{\dagger}(e^{j\omega})$. This is equivalent to the statement of the theorem. ■

The optimum orthonormal filter bank thus reduces to the scheme of Fig. 5 with $Mb$ bits allocated to the quantizer. The result of Theorem 8.2.1 is very intuitive and somewhat expected: filter and decimate the oversampled signal $x(n)$ according to its model and then quantize $y(n)$ in Fig. 5 with $\bar{b} = Mb$ bits per sample. As we mentioned before, this amounts to fixing the bit rate (number of bits per second) in order to trade quantization resolution for sampling rate. It is interesting, though, to see that this very intuitive scheme is equivalent to using an optimum orthonormal filter bank as a sophisticated quantizer for the input $x(n)$. With (7) in mind, the coding gain expression can be derived following the lines of the proof of Theorem 5.1 and is equal to $2^{2b(M-1)}$. This is an exponential gain which can be quite large for moderate values of $M$ but, unlike all previous schemes, depends on the bit rate $b$. Finally, to end this section, we would like to derive an analogous result (to Theorem 8.2.1) for the multiband case.

Theorem 8.2.2. Consider the scheme of Fig. 24 under the same assumptions. The synthesis section of the optimum orthonormal filter bank corresponds to choosing $L$ of the filters to be equal to $F_k(e^{j\omega})$ and the remaining filters $Q_k(e^{j\omega})$, $k = L+1, \ldots, M-1$, to be the $M - L - 1$ orthogonal filters to $F_i(e^{j\omega})$, $i = 0, \ldots, L-1$. In this case, the optimum orthonormal filter bank reduces to Fig. 16 with an equivalent average number of bits $\bar{b}$ equal to $Mb/L$ bits.

Proof.  By  interpreting  the  single  band  result  as  one  of  the  L  channels  of  the  multiband  model  and  by  using  result 
2  and  equation  (39),  the  result  follows  immediately.  ■ 

With the above $\bar{b}$, we can now perform an optimum allocation of subband bits for the scheme of Fig. 16. This is a standard allocation problem that arises in subband coding applications [11]. By applying the AM-GM inequality to the output error expression $\mathcal{E} = c\,\frac{1}{M}\sum_{k=0}^{L-1} 2^{-2b_k}\sigma_{y_k}^2$, we get

$$\mathcal{E} \ge c\,\frac{L}{M}\,2^{-2\bar{b}}\prod_{i=0}^{L-1}\left(\sigma_{y_i}^2\right)^{1/L} \tag{53}$$

which can be achieved by setting $b_k = \bar{b} + 0.5\log_2 \sigma_{y_k}^2 - 0.5\log_2 \prod_{i=0}^{L-1}\left(\sigma_{y_i}^2\right)^{1/L}$. This optimum bit allocation formula will in almost all cases yield non-integer solutions for the bits. A quick remedy might be to use a simple rounding procedure or a more sophisticated algorithm [19] to obtain integer solutions. A detailed discussion of the topic of allocating integer bits to the channel quantizers is however outside the scope of the paper. The noise variance in Fig. 12 simplifies to $c\,2^{-2b}\,\frac{1}{M}\sum_{k=0}^{L-1}\sigma_{y_k}^2$. The coding gain expression therefore takes the following form:

$$G_{\mathrm{opt}} = 2^{2b\left(\frac{M}{L}-1\right)}\,\frac{\mathrm{AM}(\sigma_{y_i}^2)}{\mathrm{GM}(\sigma_{y_i}^2)} \tag{54}$$

where AM is the arithmetic mean, GM is the geometric mean and $\sigma_{y_i}^2$ is the variance of the $i$th signal $y_i(n)$ in Fig. 2. We observe that when $L = 1$, we get the coding gain of the single band case, and when $L = M$, the scheme of Fig. 16 reduces to an orthonormal filter bank, the average number of bits is equal to $b$ and (54) reduces to the well known expression of the coding gain of an orthonormal filter bank.
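
The bit allocation rule and the gain formula (54) can be illustrated with a short calculation. This is a minimal sketch, assuming hypothetical subband variances and hypothetical function names; it returns real-valued (not integer) bit allocations, exactly as the closed form does.

```python
import math


def optimum_bit_allocation(variances, avg_bits):
    """b_k = avg_bits + 0.5*log2(var_k) - 0.5*log2(geometric mean of variances)."""
    L = len(variances)
    log2_gm = sum(math.log2(v) for v in variances) / L
    return [avg_bits + 0.5 * (math.log2(v) - log2_gm) for v in variances]


def coding_gain(variances, b, M):
    """Coding gain of the form 2^(2b(M/L - 1)) * AM / GM, as in (54)."""
    L = len(variances)
    am = sum(variances) / L
    gm = 2.0 ** (sum(math.log2(v) for v in variances) / L)
    return 2.0 ** (2.0 * b * (M / L - 1.0)) * am / gm


# Hypothetical example: L = 2 subband signals, M = 4, b = 3 bits per sample.
variances = [4.0, 1.0]
M, b = 4, 3
avg_bits = M * b / len(variances)          # average number of bits Mb/L
bits = optimum_bit_allocation(variances, avg_bits)
noise_terms = [2.0 ** (-2.0 * bk) * v for bk, v in zip(bits, variances)]

print("bits per subband:", bits)
print("per-subband noise terms:", noise_terms)
print("coding gain:", coding_gain(variances, b, M))
```

The allocation equalizes the terms $2^{-2b_k}\sigma_{y_k}^2$ across subbands, which is the condition for equality in the AM-GM inequality, and the allocated bits sum to $Mb$.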

Appendix  A. 

Proof of result 1 in section III. The interpolated subband signals can be expressed as $x_i(n) = \sum_{k} y_i(k) f_i(n - Mk)$. Hence,

$$E[x_i(n)x_j^*(n - Mm)] = E\left[\sum_{k'} y_i(k') f_i(n - Mk') \sum_{k} y_j^*(k) f_j^*(n - Mk - Mm)\right] \tag{55}$$

Let $r(u)$ be the cross correlation between the jointly WSS processes $y_i(n)$ and $y_j(n)$, that is, $r(u) = E[y_i(n)y_j^*(n - u)]$. Using the change of variable $k' - k = l$, the preceding equation becomes:

$$E[x_i(n)x_j^*(n - Mm)] = \sum_{l} r(l) \sum_{k'} f_i(n - Mk') f_j^*(n + M(l - m) - Mk') \tag{56}$$

Substituting (56) in the left hand side of (3), we get:

$$\frac{1}{M}\sum_{l} r(l) \sum_{k'} \sum_{k=0}^{M-1} f_i(n - (Mk' + k))\, f_j^*(n + M(l - m) - (Mk' + k)) \tag{57}$$

Since $M$ is positive, $k'$ and $k$ are integers and $0 \le k < M$, we can always replace $Mk' + k$ by an integer $u$. That is, for every integer $u$ there exist unique $k'$ and $k$ such that $k'$ is the quotient and $k$ is the remainder obtained from dividing $u$ by $M$. We can therefore rewrite (57) as follows:

$$\frac{1}{M}\sum_{l} r(l) \sum_{u} f_i(n - u) f_j^*(n + M(l - m) - u) = \frac{1}{M}\sum_{l} r(l) \sum_{k} f_i(k) f_j^*(k + M(l - m)) \tag{58}$$

But the orthonormality of the filter bank implies, in particular, that $\sum_{k} f_i(k) f_j^*(k + M(l - m)) = 0$ for all $l, m$ when $i \ne j$. Thus, the inner sum in (58) reduces to zero and the result follows. ■
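
The last step can be checked numerically. The sketch below rests on choices that are not from the paper: a two-channel ($M = 2$) orthonormal pair built from the length-four Daubechies lowpass filter and its alternating-flip highpass companion, and a hypothetical cross correlation $r(l) = 0.7^{|l|}$. The function names are also hypothetical. It evaluates the right-hand side of (58), truncated to a finite range of $l$, for several values of $m$.

```python
import math

M = 2
s3 = math.sqrt(3.0)
f0 = [c / (4.0 * math.sqrt(2.0)) for c in (1 + s3, 3 + s3, 3 - s3, 1 - s3)]
# Alternating flip f1(n) = (-1)^n f0(N-1-n): a standard orthonormal companion.
f1 = [(-1) ** n * f0[len(f0) - 1 - n] for n in range(len(f0))]


def tap(f, n):
    """FIR coefficient f(n), zero outside the filter support."""
    return f[n] if 0 <= n < len(f) else 0.0


def shifted_inner_product(fi, fj, shift):
    """sum_k fi(k) fj(k + shift) for real FIR filters."""
    return sum(tap(fi, k) * tap(fj, k + shift) for k in range(len(fi)))


def averaged_cross_correlation(fi, fj, m, r, max_lag=50):
    """Right-hand side of (58), with the sum over l truncated to |l| <= max_lag."""
    total = sum(r(l) * shifted_inner_product(fi, fj, M * (l - m))
                for l in range(-max_lag, max_lag + 1))
    return total / M


def r(l):
    """Hypothetical cross correlation between the two driving processes."""
    return 0.7 ** abs(l)


values = [averaged_cross_correlation(f0, f1, m, r) for m in range(-3, 4)]
print("largest magnitude over m:", max(abs(v) for v in values))
```

Every inner sum vanishes for $i \ne j$, so the result is zero (up to rounding) whatever the cross correlation $r(l)$ is.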

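Appendix B.

Phase randomization of a (CWSS)$_M$ process. A WSS process $\tilde{x}(n)$ can be obtained from a (CWSS)$_M$ process $x(n)$ by introducing a random shift $\theta$ in the (CWSS)$_M$ signal $x(n)$ [12], [13], [14]. The parameter $\theta$ is a discrete random variable that can take any integer value from $0$ to $M - 1$ with equal probability $1/M$. Furthermore, the random variable $\theta$ is assumed to be independent of $x(n)$. The autocorrelation function of $\tilde{x}(n)$ is given by:

$$
\begin{aligned}
R_{\tilde{x}\tilde{x}}(n,k) &= E\{\tilde{x}(n)\tilde{x}(n-k)\} = E_{\theta}\{E\{x(n-\theta)x(n-k-\theta)\mid\theta\}\} = E_{\theta}\{R_{xx}(n-\theta,k)\} \\
&= \sum_{\theta=-\infty}^{\infty} R_{xx}(n-\theta,k)\,p(\theta) = \frac{1}{M}\sum_{\theta=0}^{M-1} R_{xx}(n-\theta,k) = \frac{1}{M}\sum_{m=n}^{M+n-1} R_{xx}(m,k)
\end{aligned}
\tag{59}
$$

where the last step uses the fact that both sums run over one full period of $R_{xx}(m,k)$ in $m$. Now observe that

$$
\begin{aligned}
\frac{1}{M}\sum_{m=n}^{M+n-1} R_{xx}(m,k) &= \frac{1}{M}\sum_{m=0}^{M-1} R_{xx}(m,k) + \frac{1}{M}\sum_{m=M}^{M+n-1} R_{xx}(m,k) - \frac{1}{M}\sum_{m=0}^{n-1} R_{xx}(m,k) \\
&= \frac{1}{M}\sum_{m=0}^{M-1} R_{xx}(m,k) + \frac{1}{M}\sum_{m=M}^{M+n-1} R_{xx}(m,k) - \frac{1}{M}\sum_{m=M}^{M+n-1} R_{xx}(m,k) \\
&= \frac{1}{M}\sum_{m=0}^{M-1} R_{xx}(m,k)
\end{aligned}
\tag{60}
$$

The second line follows because $R_{xx}(m,k) = R_{xx}(m+M,k)$ by cyclostationarity. The last sum is independent of $n$, implying that $R_{\tilde{x}\tilde{x}}(n,k)$ is a function of $k$ only and that the process $\tilde{x}(n)$ is indeed WSS. Furthermore,

$$R_{\tilde{x}\tilde{x}}(k) = \frac{1}{M}\sum_{n=0}^{M-1} R_{xx}(n,k) \tag{61}$$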
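
The periodicity used in (60) and the window independence of the average can be seen on a small example. The sketch below assumes, for illustration only, the interpolation model $x(n) = \sum_i y(i) f(n - Mi)$ with a real WSS input of autocorrelation $R_{yy}(l) = \rho^{|l|}$, an arbitrary real FIR filter and $M = 2$; the function names are hypothetical.

```python
def tap(f, n):
    """FIR coefficient f(n), zero outside the filter support."""
    return f[n] if 0 <= n < len(f) else 0.0


def rxx(n, k, f, M, rho):
    """R_xx(n, k) = E[x(n) x(n-k)] for x(n) = sum_i y(i) f(n - M i)."""
    N = len(f)
    # Index ranges wide enough to cover every nonzero filter tap.
    i_range = range((n - N) // M, n // M + 1)
    j_range = range((n - k - N) // M, (n - k) // M + 1)
    return sum(tap(f, n - M * i) * tap(f, n - k - M * j) * rho ** abs(i - j)
               for i in i_range for j in j_range)


def window_average(n0, k, f, M, rho):
    """Average of R_xx(m, k) over the M consecutive values m = n0, ..., n0+M-1."""
    return sum(rxx(m, k, f, M, rho) for m in range(n0, n0 + M)) / M


f = [1.0, 2.0, 3.0, 4.0]   # arbitrary real FIR filter (assumption)
M, rho = 2, 0.5

print("R_xx(n, 0) for n = 0..5:", [rxx(n, 0, f, M, rho) for n in range(6)])
print("window averages, k = 1:", [window_average(n0, 1, f, M, rho) for n0 in range(5)])
```

$R_{xx}(n,k)$ depends on $n$ but repeats with period $M$, while its average over any $M$ consecutive values of $n$ does not depend on where the window starts.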


Appendix  C. 

Proof of equation (13). Let $x(n)$ be a (CWSS)$_M$ process input to a linear time invariant filter $P(e^{j\omega})$. The output $z(n)$ is a (CWSS)$_M$ process [8] and is related to $x(n)$ by the well known convolution sum $z(n) = \sum_{i} p(i)x(n - i)$. Our goal is to derive an expression for the average variance of the (CWSS)$_M$ process $z(n)$. So,

$$
\begin{aligned}
\sigma_z^2 &= \frac{1}{M}\sum_{n=0}^{M-1} E\{|z(n)|^2\} = \frac{1}{M}\sum_{n=0}^{M-1}\sum_{i}\sum_{j} p(i)p^*(j)\,E\{x(n-i)x^*(n-j)\} \\
&= \sum_{i}\sum_{j} p(i)p^*(j)\,\frac{1}{M}\sum_{n=0}^{M-1} R_{xx}(n-i,\,j-i) = \sum_{i}\sum_{j} p(i)p^*(j)\,R_{\tilde{x}\tilde{x}}(j-i)
\end{aligned}
\tag{62}
$$

where the last equality follows from equation (61), $R_{\tilde{x}\tilde{x}}(k)$ being the autocorrelation averaged over one period. By making the change of variables $j - i = l$, we get:

$$\sigma_z^2 = \sum_{l}\sum_{j} p(j-l)p^*(j)\,R_{\tilde{x}\tilde{x}}(l) = \sum_{l} r_p(l)\,R_{\tilde{x}\tilde{x}}(l) \tag{63}$$

where $r_p(l) = \sum_{k} p^*(k)p(k-l)$ is the deterministic autocorrelation of $p(n)$. Taking the discrete time Fourier transform of (63), we get (11). ■

Appendix  D. 

Power spectral density of an interpolated random process. Let $y(n)$ be a wide sense stationary (WSS) random process, input to an interpolation filter as shown in Fig. 1. The output $x(n)$ is in general a (CWSS)$_M$ process [8]. The average power spectral density of the “stationarized” process has the form

$$S_{xx}(e^{j\omega}) = \frac{1}{M}\,S_{yy}(e^{j\omega M})\,|F(e^{j\omega})|^2 \tag{64}$$

To derive (64), we can use (61) to write the averaged autocorrelation as

$$R_{\tilde{x}\tilde{x}}(k) = \frac{1}{M}\sum_{i}\sum_{j} R_{yy}(i-j)\sum_{n=0}^{M-1} f(n - Mi)\,f(n - k - Mj) \tag{65}$$

Making the consecutive changes of variables $i - j = l$ and $n - Mi = u$, equation (65) simplifies to:

$$
\begin{aligned}
R_{\tilde{x}\tilde{x}}(k) &= \frac{1}{M}\sum_{l} R_{yy}(l)\sum_{i}\ \sum_{u=-Mi}^{M-1-Mi} f(u)\,f(u - k + Ml) \\
&= \frac{1}{M}\sum_{l} R_{yy}(l)\sum_{u} f(u)\,f(u - (k - Ml)) = \frac{1}{M}\sum_{l} R_{yy}(l)\,r_f(k - Ml)
\end{aligned}
\tag{66}
$$

where $r_f(n)$ is the deterministic autocorrelation of $f(n)$ as defined in Appendix C. Equation (66) can be interpreted as passing the autocorrelation sequence $\frac{1}{M}R_{yy}(n)$ through the interpolation filter $r_f(n)$. Taking the Fourier transform of (66), we obtain (64) or equivalently (12). The expression for the multiband case, equation (15), can be obtained in a similar fashion. Again, from (61), one can write:


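$$
\begin{aligned}
R_{\tilde{x}\tilde{x}}(k) &= \frac{1}{M}\sum_{n=0}^{M-1} E\{x(n)x^*(n-k)\} = \frac{1}{M}\sum_{n=0}^{M-1}\sum_{i}\sum_{j} \mathbf{f}^{T}(n - Mi)\,E\{\mathbf{y}(i)\mathbf{y}^{\dagger}(j)\}\,\mathbf{f}^{*}(n - k - Mj) \\
&= \frac{1}{M}\sum_{n=0}^{M-1}\sum_{i}\sum_{j} \mathbf{f}^{T}(n - Mi)\,\mathbf{R}_y(i-j)\,\mathbf{f}^{*}(n - k - Mj)
\end{aligned}
\tag{67}
$$

where $\mathbf{f}(n) = (\,f_0(n)\ \ f_1(n)\ \ \ldots\ \ f_{L-1}(n)\,)^{T}$, $\mathbf{y}(n)$ is the vector of the $L$ WSS inputs $y_k(n)$ and $\mathbf{R}_y(k) = E\{\mathbf{y}(n)\mathbf{y}^{\dagger}(n-k)\}$ is their autocorrelation matrix.

By following the same steps used to derive (64), we obtain (13).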
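
The single band relation between (66) and (64) can be verified numerically. The sketch below is only an illustration: it assumes a real WSS input with $R_{yy}(l) = \rho^{|l|}$, whose power spectrum is $(1-\rho^2)/(1+\rho^2-2\rho\cos\omega)$, an arbitrary real FIR filter and $M = 2$, and it truncates the Fourier sum at a lag where the autocorrelation is negligible. The function names are hypothetical.

```python
import cmath
import math


def tap(f, n):
    """FIR coefficient f(n), zero outside the filter support."""
    return f[n] if 0 <= n < len(f) else 0.0


def r_f(f, k):
    """Deterministic autocorrelation r_f(k) = sum_u f(u) f(u - k), real f."""
    return sum(tap(f, u) * tap(f, u - k) for u in range(len(f)))


def averaged_autocorrelation(k, f, M, rho):
    """Right-hand side of (66): (1/M) sum_l R_yy(l) r_f(k - M l)."""
    N = len(f)
    l_lo, l_hi = (k - N) // M, (k + N) // M + 1   # covers all nonzero r_f terms
    return sum(rho ** abs(l) * r_f(f, k - M * l)
               for l in range(l_lo, l_hi + 1)) / M


def psd_from_autocorrelation(w, f, M, rho, max_lag=200):
    """Truncated Fourier transform of the averaged autocorrelation."""
    return sum(averaged_autocorrelation(k, f, M, rho) * cmath.exp(-1j * w * k)
               for k in range(-max_lag, max_lag + 1)).real


def psd_closed_form(w, f, M, rho):
    """Right-hand side of (64): (1/M) S_yy(e^{jwM}) |F(e^{jw})|^2."""
    syy = (1.0 - rho ** 2) / (1.0 + rho ** 2 - 2.0 * rho * math.cos(M * w))
    F = sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(f))
    return syy * abs(F) ** 2 / M


f = [1.0, 2.0, 3.0, 4.0]   # arbitrary real FIR filter (assumption)
M, rho = 2, 0.5
for w in (0.3, 1.0, 2.5):
    print(w, psd_from_autocorrelation(w, f, M, rho), psd_closed_form(w, f, M, rho))
```

The two columns agree to within the truncation and rounding error, which reflects the interpretation of (66) as $\frac{1}{M}R_{yy}(n)$ passed through the interpolation filter $r_f(n)$.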